https://www.r-bloggers.com/summary-of-community-detection-algorithms-in-igraph-0-6/
Community detection in graphs - community detection algorithms implemented in igraph:
edge.betweenness.community (w,-d) - hierarchical decomposition process, very slow
walktrap.community (w,-d) - based on random walks; in my case it generates too many clusters that are too small
fastgreedy.community (w) - bottom-up, suffers from a resolution limit
spinglass.community (w,d, not for unconnected graphs) - approach from statistical physics, based on the so-called Potts model. It is not guaranteed that nodes in completely remote (or disconnected) parts of the network have different spin states; in my case, without set.seed() (random number generation) it generates different assignments on each run
infomap.community (w,d) - in my case, works only on the undirected, simplified (loops and multiple edges removed) LKN/CKN with 128 (2^7) iterations and set.seed()
label.propagation.community (w) - very fast, but yields different results depending on the initial configuration (which is decided randomly)
multilevel.community (w) - same as cluster_louvain(g)
leading.eigenvector.community (w) - top-down hierarchical approach
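To compare several of the algorithms listed above on one small graph, a minimal sketch follows. It assumes a recent igraph release in which the functions are available under the cluster_* names, and it uses the built-in Zachary karate club graph as a stand-in for the real network. The names g_demo, algos and summary_tab are illustrative and not part of the original workflow.

```r
library(igraph)

set.seed(123456)                      # label propagation and Louvain use random choices
g_demo <- make_graph("Zachary")       # small undirected example graph shipped with igraph

# Algorithms to compare (assumed modern names of the functions listed above)
algos <- list(
  louvain     = cluster_louvain,
  walktrap    = cluster_walktrap,
  fast_greedy = cluster_fast_greedy,
  label_prop  = cluster_label_prop
)

# Run each algorithm and collect number of clusters and modularity
res <- lapply(algos, function(f) f(g_demo))
summary_tab <- data.frame(
  algorithm  = names(res),
  n_clusters = sapply(res, length),
  modularity = sapply(res, modularity)
)
print(summary_tab)
```

Running this with and without set.seed() gives a quick feel for which algorithms are sensitive to the random start.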
Pipeline:
1. Cluster LKN/CKN using Multi-Level algorithm (https://arxiv.org/abs/0803.0476)
2. After Multi-Level, re-cluster the subnetworks in big.0.0 (cluster size in [2^10, max)) with the spinglass algorithm, according to the number of hub nodes, and plot them separately.
4. BE CAREFUL: the spinglass algorithm works only on connected graphs. Before each spinglass clustering, check whether the subgraph disintegrates into weak components.
https://lists.nongnu.org/archive/html/igraph-help/2010-04/msg00076.html - layout repeatability in R and spinglass.community function?
“…it gives different results each time because it is starting from different random start-points each time.”
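The two cautions above (connectivity and random start points) can be sketched as follows. This is only an illustration under assumptions: the graph is a toy graph with two components, and spinglass_by_component is a hypothetical helper, not a function from igraph or from this pipeline.

```r
library(igraph)

# Toy graph with two weakly connected components (stand-in for a cluster subgraph)
g_demo <- disjoint_union(make_graph("Zachary"), make_ring(6))
print(is_connected(g_demo, mode = "weak"))   # FALSE, so spinglass cannot run on it directly

# Hypothetical helper: split into weak components, then run spinglass
# on each component with the same fixed seed
spinglass_by_component <- function(graph, seed = 123456) {
  parts <- decompose(graph, mode = "weak")
  lapply(parts, function(p) {
    set.seed(seed)
    cluster_spinglass(p)
  })
}

run1 <- spinglass_by_component(g_demo)
run2 <- spinglass_by_component(g_demo)

# With the same seed the memberships should be reproducible
same <- mapply(function(a, b) {
  identical(as.vector(membership(a)), as.vector(membership(b)))
}, run1, run2)
print(same)
```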
if (!require("igraph")) install.packages("igraph")
library(igraph)
if (!require("tictoc")) install.packages("tictoc")
require(tictoc)
if (!require("gtools")) install.packages("gtools")
library(gtools)
if (!require("reshape2")) install.packages("reshape2")
library(reshape2)
if (!require("gmp")) install.packages("gmp")
library(gmp)
tic()
set.seed(123456)
Warnings in igraph because of bug - fix
autocurve.edges <- function(graph, start=0.5) {
  el <- apply(get.edgelist(graph, names = FALSE), 1, paste, collapse = ":")
  ave(rep(NA, length(el)), el, FUN = function(x) {
    if (length(x) == 1) {
      return(0)
    } else {
      return(seq(-start, start, length = length(x)))
    }
  })
}
'%ni%' = Negate('%in%')
myname = your network.graphml
# https://string-db.org/cgi/network.pl?taskId=DkhvPMjSF0rr
# myname = 'STRING_PR1.graphml' #
myname = 'erdos.renyi.game_10000_0.001.graphml'
dir0 = getwd()
g = read_graph(paste0(myname), format = "graphml")
print(summary(g))
## IGRAPH 7932c48 DN-- 10000 49742 --
## + attr: SUID (g/n), shared name (g/c), name (g/c), selected (g/l),
## | __Annotations (g/c), type (g/c), loops (g/l), p (g/n), SUID
## | (v/n), shared name (v/c), name (v/c), selected (v/l), geneID
## | (v/c), MapManBin (v/c), shortName (v/c), shortDescription (v/c),
## | id (v/c), SUID (e/n), shared name (e/c), shared interaction
## | (e/c), name (e/c), selected (e/l), interaction (e/c),
## | reactionType (e/c), geneID1 (e/c), geneID2 (e/c)
## IGRAPH 7932c48 DN-- 10000 49742 --
## + attr: SUID (g/n), shared name (g/c), name (g/c), selected (g/l),
## | __Annotations (g/c), type (g/c), loops (g/l), p (g/n), SUID
## | (v/n), shared name (v/c), name (v/c), selected (v/l), geneID
## | (v/c), MapManBin (v/c), shortName (v/c), shortDescription (v/c),
## | id (v/c), SUID (e/n), shared name (e/c), shared interaction
## | (e/c), name (e/c), selected (e/l), interaction (e/c),
## | reactionType (e/c), geneID1 (e/c), geneID2 (e/c)
## + edges from 7932c48 (vertex names):
## [1] node1->node577 node1->node953 node1->node1210 node1->node2085
## [5] node1->node2806 node1->node4084 node1->node5441 node1->node5463
## + ... omitted several edges
list.vertex.attributes(g)
## [1] "SUID" "shared name" "name"
## [4] "selected" "geneID" "MapManBin"
## [7] "shortName" "shortDescription" "id"
list.edge.attributes(g)
## [1] "SUID" "shared name" "shared interaction"
## [4] "name" "selected" "interaction"
## [7] "reactionType" "geneID1" "geneID2"
myname = unlist(strsplit(myname, "[.]"))[1]
dir1 = paste(dir0,'/clusteringResults',sep = '')
ifelse(!dir.exists(dir1), dir.create(dir1), FALSE)
## [1] TRUE
Graph summary
A scale-free network is a network whose degree distribution follows a power law, at least asymptotically.
CHECK all the vertex and edge attributes! This matters at the end: the graphical parameters/attributes will not be saved, all others will. Also be careful which column holds the short NIB ZR biological name.
V(g)$names = V(g)$geneID
V(g)$nodeID = V(g)$names
V(g)$numVec = seq(1,length(V(g)$id),1)
# list.vertex.attributes(g)
# list.edge.attributes(g)
quantile(degree(g), c(0.5, 0.75, 0.90, 0.95, 0.975, 0.99, 0.995, 0.9987, 1.0))
## 50% 75% 90% 95% 97.5% 99% 99.5% 99.87% 100%
## 10.000 12.000 14.000 15.000 16.025 18.000 19.000 21.000 24.000
which(degree(g) > 0.6*max(degree(g)))
## node12 node38 node66 node87 node94 node133 node135 node139
## 12 38 66 87 94 133 135 139
## node197 node217 node236 node246 node262 node312 node323 node326
## 197 217 236 246 262 312 323 326
## node335 node338 node343 node345 node346 node349 node359 node379
## 335 338 343 345 346 349 359 379
## node382 node383 node392 node410 node418 node420 node436 node441
## 382 383 392 410 418 420 436 441
## node455 node458 node479 node481 node485 node492 node499 node506
## 455 458 479 481 485 492 499 506
## node535 node540 node564 node632 node641 node658 node659 node663
## 535 540 564 632 641 658 659 663
## node669 node681 node683 node686 node702 node719 node726 node743
## 669 681 683 686 702 719 726 743
## node765 node770 node773 node789 node790 node793 node795 node826
## 765 770 773 789 790 793 795 826
## node845 node849 node873 node875 node888 node897 node900 node920
## 845 849 873 875 888 897 900 920
## node935 node943 node944 node950 node957 node972 node975 node995
## 935 943 944 950 957 972 975 995
## node1002 node1021 node1065 node1072 node1103 node1114 node1134 node1136
## 1002 1021 1065 1072 1103 1114 1134 1136
## node1159 node1167 node1172 node1214 node1233 node1234 node1239 node1253
## 1159 1167 1172 1214 1233 1234 1239 1253
## node1274 node1276 node1285 node1306 node1308 node1313 node1314 node1316
## 1274 1276 1285 1306 1308 1313 1314 1316
## node1357 node1369 node1385 node1400 node1413 node1415 node1416 node1429
## 1357 1369 1385 1400 1413 1415 1416 1429
## node1441 node1444 node1471 node1505 node1510 node1527 node1528 node1537
## 1441 1444 1471 1505 1510 1527 1528 1537
## node1539 node1548 node1558 node1610 node1612 node1620 node1635 node1643
## 1539 1548 1558 1610 1612 1620 1635 1643
## node1649 node1653 node1654 node1679 node1680 node1725 node1745 node1746
## 1649 1653 1654 1679 1680 1725 1745 1746
## node1756 node1762 node1763 node1768 node1802 node1807 node1811 node1815
## 1756 1762 1763 1768 1802 1807 1811 1815
## node1833 node1857 node1858 node1870 node1876 node1884 node1917 node1919
## 1833 1857 1858 1870 1876 1884 1917 1919
## node1943 node1972 node1985 node1988 node1995 node2001 node2003 node2013
## 1943 1972 1985 1988 1995 2001 2003 2013
## node2042 node2043 node2059 node2071 node2086 node2087 node2090 node2092
## 2042 2043 2059 2071 2086 2087 2090 2092
## node2100 node2102 node2108 node2130 node2157 node2169 node2183 node2192
## 2100 2102 2108 2130 2157 2169 2183 2192
## node2193 node2217 node2243 node2246 node2267 node2274 node2299 node2306
## 2193 2217 2243 2246 2267 2274 2299 2306
## node2307 node2314 node2328 node2331 node2338 node2349 node2353 node2356
## 2307 2314 2328 2331 2338 2349 2353 2356
## node2378 node2388 node2389 node2412 node2419 node2421 node2506 node2514
## 2378 2388 2389 2412 2419 2421 2506 2514
## node2517 node2541 node2546 node2551 node2594 node2600 node2610 node2615
## 2517 2541 2546 2551 2594 2600 2610 2615
## node2622 node2628 node2661 node2667 node2678 node2679 node2703 node2733
## 2622 2628 2661 2667 2678 2679 2703 2733
## node2735 node2775 node2799 node2806 node2808 node2818 node2819 node2824
## 2735 2775 2799 2806 2808 2818 2819 2824
## node2832 node2837 node2848 node2866 node2890 node2897 node2924 node2927
## 2832 2837 2848 2866 2890 2897 2924 2927
## node2939 node2942 node2963 node2971 node2981 node2988 node3004 node3025
## 2939 2942 2963 2971 2981 2988 3004 3025
## node3027 node3044 node3052 node3060 node3081 node3087 node3092 node3096
## 3027 3044 3052 3060 3081 3087 3092 3096
## node3111 node3151 node3152 node3154 node3171 node3179 node3195 node3204
## 3111 3151 3152 3154 3171 3179 3195 3204
## node3226 node3229 node3234 node3237 node3269 node3282 node3289 node3306
## 3226 3229 3234 3237 3269 3282 3289 3306
## node3323 node3332 node3346 node3352 node3376 node3386 node3409 node3419
## 3323 3332 3346 3352 3376 3386 3409 3419
## node3422 node3432 node3475 node3477 node3496 node3498 node3514 node3517
## 3422 3432 3475 3477 3496 3498 3514 3517
## node3525 node3532 node3540 node3550 node3561 node3585 node3590 node3601
## 3525 3532 3540 3550 3561 3585 3590 3601
## node3616 node3626 node3634 node3650 node3698 node3716 node3717 node3752
## 3616 3626 3634 3650 3698 3716 3717 3752
## node3766 node3772 node3787 node3816 node3823 node3850 node3898 node3911
## 3766 3772 3787 3816 3823 3850 3898 3911
## node3938 node3950 node3974 node3979 node3987 node4004 node4006 node4020
## 3938 3950 3974 3979 3987 4004 4006 4020
## node4024 node4055 node4078 node4079 node4110 node4118 node4137 node4154
## 4024 4055 4078 4079 4110 4118 4137 4154
## node4155 node4163 node4171 node4211 node4230 node4274 node4288 node4305
## 4155 4163 4171 4211 4230 4274 4288 4305
## node4312 node4313 node4323 node4329 node4341 node4367 node4370 node4380
## 4312 4313 4323 4329 4341 4367 4370 4380
## node4410 node4416 node4435 node4440 node4450 node4455 node4460 node4478
## 4410 4416 4435 4440 4450 4455 4460 4478
## node4484 node4485 node4497 node4514 node4518 node4524 node4561 node4572
## 4484 4485 4497 4514 4518 4524 4561 4572
## node4600 node4626 node4648 node4649 node4656 node4690 node4695 node4696
## 4600 4626 4648 4649 4656 4690 4695 4696
## node4700 node4706 node4708 node4720 node4726 node4729 node4731 node4736
## 4700 4706 4708 4720 4726 4729 4731 4736
## node4740 node4748 node4759 node4768 node4778 node4781 node4798 node4804
## 4740 4748 4759 4768 4778 4781 4798 4804
## node4806 node4813 node4836 node4839 node4850 node4871 node4898 node4917
## 4806 4813 4836 4839 4850 4871 4898 4917
## node4922 node4927 node4933 node4942 node4972 node4973 node4979 node4997
## 4922 4927 4933 4942 4972 4973 4979 4997
## node5000 node5014 node5025 node5059 node5061 node5064 node5065 node5066
## 5000 5014 5025 5059 5061 5064 5065 5066
## node5073 node5097 node5107 node5113 node5120 node5125 node5136 node5149
## 5073 5097 5107 5113 5120 5125 5136 5149
## node5154 node5181 node5202 node5210 node5242 node5259 node5274 node5298
## 5154 5181 5202 5210 5242 5259 5274 5298
## node5299 node5311 node5314 node5338 node5365 node5391 node5414 node5420
## 5299 5311 5314 5338 5365 5391 5414 5420
## node5447 node5457 node5468 node5470 node5483 node5503 node5515 node5556
## 5447 5457 5468 5470 5483 5503 5515 5556
## node5557 node5570 node5572 node5581 node5608 node5611 node5620 node5627
## 5557 5570 5572 5581 5608 5611 5620 5627
## node5639 node5641 node5648 node5661 node5663 node5667 node5670 node5679
## 5639 5641 5648 5661 5663 5667 5670 5679
## node5682 node5688 node5717 node5760 node5772 node5779 node5782 node5792
## 5682 5688 5717 5760 5772 5779 5782 5792
## node5804 node5808 node5813 node5822 node5846 node5895 node5896 node5905
## 5804 5808 5813 5822 5846 5895 5896 5905
## node5912 node5915 node5928 node5939 node5957 node5965 node6008 node6024
## 5912 5915 5928 5939 5957 5965 6008 6024
## node6030 node6049 node6057 node6067 node6075 node6092 node6123 node6124
## 6030 6049 6057 6067 6075 6092 6123 6124
## node6135 node6164 node6177 node6190 node6205 node6228 node6247 node6253
## 6135 6164 6177 6190 6205 6228 6247 6253
## node6260 node6268 node6281 node6294 node6295 node6296 node6317 node6349
## 6260 6268 6281 6294 6295 6296 6317 6349
## node6363 node6365 node6370 node6423 node6429 node6456 node6468 node6475
## 6363 6365 6370 6423 6429 6456 6468 6475
## node6483 node6501 node6503 node6505 node6519 node6526 node6528 node6531
## 6483 6501 6503 6505 6519 6526 6528 6531
## node6555 node6575 node6583 node6592 node6654 node6657 node6665 node6666
## 6555 6575 6583 6592 6654 6657 6665 6666
## node6672 node6678 node6717 node6721 node6734 node6740 node6743 node6808
## 6672 6678 6717 6721 6734 6740 6743 6808
## node6832 node6849 node6858 node6885 node6888 node6894 node6899 node6918
## 6832 6849 6858 6885 6888 6894 6899 6918
## node6920 node6923 node6928 node6929 node6932 node6960 node6989 node7003
## 6920 6923 6928 6929 6932 6960 6989 7003
## node7018 node7022 node7029 node7054 node7059 node7072 node7092 node7112
## 7018 7022 7029 7054 7059 7072 7092 7112
## node7127 node7138 node7153 node7155 node7163 node7170 node7187 node7191
## 7127 7138 7153 7155 7163 7170 7187 7191
## node7197 node7199 node7205 node7208 node7214 node7217 node7218 node7236
## 7197 7199 7205 7208 7214 7217 7218 7236
## node7257 node7267 node7269 node7283 node7306 node7312 node7314 node7331
## 7257 7267 7269 7283 7306 7312 7314 7331
## node7341 node7355 node7361 node7362 node7371 node7414 node7434 node7452
## 7341 7355 7361 7362 7371 7414 7434 7452
## node7456 node7465 node7468 node7490 node7508 node7509 node7525 node7541
## 7456 7465 7468 7490 7508 7509 7525 7541
## node7565 node7570 node7578 node7580 node7583 node7604 node7616 node7623
## 7565 7570 7578 7580 7583 7604 7616 7623
## node7626 node7634 node7638 node7651 node7657 node7668 node7671 node7687
## 7626 7634 7638 7651 7657 7668 7671 7687
## node7691 node7697 node7702 node7705 node7714 node7718 node7722 node7726
## 7691 7697 7702 7705 7714 7718 7722 7726
## node7740 node7742 node7751 node7757 node7770 node7774 node7781 node7785
## 7740 7742 7751 7757 7770 7774 7781 7785
## node7788 node7792 node7808 node7832 node7859 node7865 node7889 node7950
## 7788 7792 7808 7832 7859 7865 7889 7950
## node7959 node7968 node7972 node7984 node8016 node8018 node8043 node8049
## 7959 7968 7972 7984 8016 8018 8043 8049
## node8065 node8075 node8091 node8100 node8106 node8107 node8113 node8124
## 8065 8075 8091 8100 8106 8107 8113 8124
## node8127 node8138 node8139 node8142 node8153 node8165 node8174 node8179
## 8127 8138 8139 8142 8153 8165 8174 8179
## node8180 node8199 node8206 node8211 node8213 node8214 node8222 node8233
## 8180 8199 8206 8211 8213 8214 8222 8233
## node8244 node8273 node8305 node8318 node8321 node8335 node8341 node8342
## 8244 8273 8305 8318 8321 8335 8341 8342
## node8351 node8359 node8360 node8371 node8377 node8381 node8394 node8403
## 8351 8359 8360 8371 8377 8381 8394 8403
## node8405 node8410 node8426 node8437 node8446 node8470 node8471 node8472
## 8405 8410 8426 8437 8446 8470 8471 8472
## node8496 node8519 node8521 node8540 node8543 node8547 node8551 node8568
## 8496 8519 8521 8540 8543 8547 8551 8568
## node8577 node8596 node8608 node8627 node8634 node8636 node8646 node8653
## 8577 8596 8608 8627 8634 8636 8646 8653
## node8654 node8664 node8670 node8680 node8708 node8716 node8722 node8725
## 8654 8664 8670 8680 8708 8716 8722 8725
## node8735 node8755 node8759 node8767 node8781 node8798 node8809 node8816
## 8735 8755 8759 8767 8781 8798 8809 8816
## node8825 node8831 node8836 node8839 node8850 node8872 node8875 node8883
## 8825 8831 8836 8839 8850 8872 8875 8883
## node8904 node8906 node8912 node8931 node8933 node8941 node8943 node8987
## 8904 8906 8912 8931 8933 8941 8943 8987
## node9011 node9018 node9021 node9036 node9037 node9046 node9069 node9091
## 9011 9018 9021 9036 9037 9046 9069 9091
## node9112 node9141 node9144 node9145 node9153 node9167 node9170 node9177
## 9112 9141 9144 9145 9153 9167 9170 9177
## node9185 node9192 node9230 node9234 node9248 node9293 node9297 node9303
## 9185 9192 9230 9234 9248 9293 9297 9303
## node9306 node9316 node9337 node9359 node9360 node9374 node9376 node9390
## 9306 9316 9337 9359 9360 9374 9376 9390
## node9403 node9410 node9453 node9458 node9461 node9503 node9507 node9519
## 9403 9410 9453 9458 9461 9503 9507 9519
## node9553 node9560 node9581 node9583 node9584 node9590 node9607 node9630
## 9553 9560 9581 9583 9584 9590 9607 9630
## node9643 node9644 node9648 node9649 node9653 node9661 node9663 node9666
## 9643 9644 9648 9649 9653 9661 9663 9666
## node9678 node9703 node9710 node9735 node9742 node9768 node9824 node9836
## 9678 9703 9710 9735 9742 9768 9824 9836
## node9846 node9854 node9871 node9872 node9873 node9902 node9907 node9926
## 9846 9854 9871 9872 9873 9902 9907 9926
## node9932 node9934 node9945 node9958 node9976 node9988 node9994 node9999
## 9932 9934 9945 9958 9976 9988 9994 9999
# just for drawing purposes
# V(g)$size = rep(40,length(V(g)$size))
# V(g)$size = sample(seq(20,40,1), vcount(g), replace = TRUE)
Finding community structure by multi-level optimization of modularity
https://www.r-bloggers.com/summary-of-community-detection-algorithms-in-igraph-0-6/
https://arxiv.org/abs/0803.0476
fastgreedy.community merges pairs of communities iteratively, always choosing the pair that yields the maximum increase in the overall modularity. In multilevel.community, communities are not merged; instead of that, nodes are moved between communities such that each node makes a local decision that maximizes its own contribution to the modularity score. When this procedure gets stuck (i.e. none of the nodes change their membership), then all the communities are collapsed into single nodes and the process continues (that’s why it is multilevel).
On undirected, simplified network (LKN/CKN)
Finding community structure by multi-level optimization of modularity
Description: This function implements the multi-level modularity optimization algorithm for finding community structure, see references below. It is based on the modularity measure and a hierarchical approach.
Usage: cluster_louvain(graph, weights = NULL)
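A minimal sketch of what the multi-level result looks like, assuming the built-in Zachary karate club graph as a toy input (g_demo, cl_demo and level_sizes are illustrative names). The memberships element holds one row per level, so the number of distinct labels per row shows how communities are collapsed from level to level.

```r
library(igraph)

set.seed(123456)                       # node visiting order is randomised
g_demo  <- make_graph("Zachary")
cl_demo <- cluster_louvain(g_demo)

# One row per level, one column per vertex
level_sizes <- apply(cl_demo$memberships, 1, function(m) length(unique(m)))
print(level_sizes)                     # communities per level

print(length(cl_demo))                 # communities in the returned partition
print(modularity(cl_demo))             # modularity of the returned partition
```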
BONUS: add overall degree as vertex attribute
################################################################################
# The number of edges remains constant,
# an undirected edge is created for each directed one,
# this version might create graphs with multiple edges.
gu = as.undirected(g, mode = "each")
# summary(gu)
gs = simplify(gu,
remove.multiple = TRUE,
remove.loops = TRUE,
edge.attr.comb = "concat")
# "concat": concatenate the attributes, using the c function.
# This almost always results in a complex attribute.
# summary(gs)
V(gu)$numVec = seq(1,length(V(gu)$id),1)
V(gs)$numVec = seq(1,length(V(gs)$id),1)
V(gs)$degreeFullSimplified = degree(gs, loops = FALSE,
normalized = FALSE, mode = "all")
V(g)$degreeFullSimplified = degree(gs, loops = FALSE,
normalized = FALSE, mode = "all")
# tic()
cl = multilevel.community(gs)
# exectime <- toc()
is_hierarchical(cl)
## [1] FALSE
names(cl)
## [1] "membership" "memberships" "modularity" "names" "vcount"
## [6] "algorithm"
# cl$membership
# cl$names
https://en.wikipedia.org/wiki/Power_law
http://stackoverflow.com/questions/21541240/goodness-of-fit-test-for-power-law-distribution-in-r
https://cran.r-project.org/web/packages/poweRlaw/
degree_distribution returns a numeric vector whose length is the maximum degree plus one. The first element is the relative frequency of zero-degree vertices, the second that of vertices with degree one, etc.
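A small worked example of that layout, using a 5-vertex undirected star as an assumed toy graph (g_demo and dd_demo are illustrative names): one hub has degree 4 and four leaves have degree 1.

```r
library(igraph)

g_demo  <- make_star(5, mode = "undirected")   # 1 hub (degree 4), 4 leaves (degree 1)
dd_demo <- degree_distribution(g_demo)

# Position k + 1 holds the share of vertices with degree k
names(dd_demo) <- paste0("deg", seq_along(dd_demo) - 1)
print(dd_demo)

# Length is the maximum degree plus one (degrees 0..4)
print(length(dd_demo) == max(degree(g_demo)) + 1)
```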
dd = degree_distribution(gs, v = V(gs), loops = FALSE,
                         normalized = FALSE, mode = "all")
plot(sort(dd, decreasing = TRUE))
ddd = degree(gs, v = V(gs))
plot(sort(ddd, decreasing = TRUE))
max(ddd) + 1 == length(dd)
## [1] TRUE
plf = power.law.fit(dd, implementation = "plfit")
print(plf)
## $continuous
## [1] TRUE
##
## $alpha
## [1] 1.238736
##
## $xmin
## [1] 2e-04
##
## $logLik
## [1] 50.4867
##
## $KS.stat
## [1] 0.2447133
##
## $KS.p
## [1] 0.1001378
plf$alpha
## [1] 1.238736
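As far as I can tell from the igraph help page, the first argument of power.law.fit is the data to fit, and the example there passes a vector of vertex degrees rather than relative frequencies. A minimal sketch of that variant follows, assuming a recent igraph where the function is also available as fit_power_law, and using a preferential-attachment toy graph in place of the real network (g_demo, deg and fit_demo are illustrative names).

```r
library(igraph)

set.seed(123456)
g_demo <- sample_pa(2000, directed = FALSE)   # toy graph with a heavy-tailed degree sequence
deg    <- degree(g_demo)                      # raw degrees, one value per vertex

# Fit on the degree vector itself; xmin is estimated by the plfit implementation
fit_demo <- fit_power_law(deg, implementation = "plfit")
print(fit_demo$alpha)
print(fit_demo$xmin)
```

The fitted exponent from this sketch is not comparable with the values printed in this document, which were obtained from dd.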
xxm = plf$xmin + 0.01
alpha = vector(length = 0)
xm = vector(length = 0)
powerl = vector(length = 0)
powerl3 = vector(length = 0)
for (i in seq(xxm, max(dd), 0.01)) {
  plf = power.law.fit(dd, implementation = "plfit", xmin = i)
  alpha[length(alpha) + 1] = plf$alpha
  xm[length(xm) + 1] = i
  powerl[length(powerl) + 1] = (round(plf$alpha) == 2)
  powerl3[length(powerl3) + 1] = (round(plf$alpha) == 3)
}
# closest value
alpha[ which.min(abs(alpha - 2.00))]
## [1] 1.969857
xm[ which.min(abs(alpha - 2.00))]
## [1] 0.0302
aalpha = alpha[alpha >= 2.0]
aalpha[ which.min(abs(aalpha - 2.00))]
## [1] 2.227492
xm[ which.min(abs(aalpha - 2.00))]
## [1] 0.0102
aalpha = alpha[alpha <= 3]
aalpha[ which.min(abs(aalpha - 3))]
## [1] 2.843998
xm[ which.min(abs(aalpha - 3))]
## [1] 0.0602
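One caveat on the last lookups: which.min is taken on the filtered vector aalpha, but the resulting position is used to index the unfiltered xm, and positions in a filtered vector generally differ from positions in the original. A minimal sketch with made-up numbers (alpha_demo and xm_demo are illustrative, not results from this network) shows how to keep the two vectors aligned by applying the same filter to both.

```r
# Made-up example values: exponent estimates and the xmin used for each
alpha_demo <- c(1.2, 1.9, 2.3, 2.1, 3.4)
xm_demo    <- c(0.01, 0.02, 0.03, 0.04, 0.05)

keep <- alpha_demo >= 2                          # same filter for both vectors
best <- which.min(abs(alpha_demo[keep] - 2))     # position within the filtered vector

alpha_best <- alpha_demo[keep][best]             # closest exponent at or above 2
xm_best    <- xm_demo[keep][best]                # xmin that belongs to that exponent

# Indexing the unfiltered vector with the filtered position picks another entry
xm_naive <- xm_demo[best]

print(c(alpha_best = alpha_best, xm_best = xm_best, xm_naive = xm_naive))
```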
Calculate some topological network parameters (http://www.nature.com/nprot/journal/v7/n4/full/nprot.2012.004.html) like centralisation, clustering coeff., number of hub nodes (http://www.nature.com/articles/srep08665)
http://igraph.org/r/doc/centralize.html
For degree, closeness and betweenness the most centralized structure is some version of the star graph, in-star, out-star or undirected star.
For eigenvector centrality the most centralized structure is the graph with a single edge (and potentially many isolates).
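The star-graph statement can be checked on toy graphs. The sketch below assumes an undirected 10-vertex star and a 10-vertex ring built with igraph's generators (star, ring and the c_* / b_* names are illustrative), and uses the same loops = FALSE and normalized = TRUE settings as the per-cluster code in this document.

```r
library(igraph)

star <- make_star(10, mode = "undirected")   # one hub connected to nine leaves
ring <- make_ring(10)                        # every vertex has degree 2

# Degree centralization: maximal for the star, zero for the regular ring
c_star <- centr_degree(star, loops = FALSE, normalized = TRUE)$centralization
c_ring <- centr_degree(ring, loops = FALSE, normalized = TRUE)$centralization

# Betweenness centralization of the star
b_star <- centr_betw(star, directed = FALSE, normalized = TRUE)$centralization

print(c(c_star = c_star, c_ring = c_ring, b_star = b_star))
```

Values close to 1 therefore point to a star-like cluster, which is how the starS and starGSS measures are read later on.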
BE CAREFUL: a network with 3 nodes or fewer cannot be plotted!
t = (as.data.frame(table(cl$membership)))
t[,1] = as.numeric(t[,1])
t[,2] = as.numeric(t[,2])
d = t[with(t, order(-Freq, Var1)), ]
dim(d)[1] == length(unique(cl$membership))
## [1] TRUE
sum(d[,2]) == vcount(g)
## [1] TRUE
# singletons
sum((t[t[,2] < 2, ])[,2])
## [1] 1
# duos
sum((t[t[,2] == 2, ])[,2])
## [1] 0
# trios
sum((t[t[,2] == 3, ])[,2])
## [1] 0
# quartets
sum((t[t[,2] == 4, ])[,2])
## [1] 0
V(g)$membership <- cl$membership
V(gu)$membership <- cl$membership
V(gs)$membership <- cl$membership
hist(cl$membership, breaks = length(unique(cl$membership)))
toSmallInd = d[(d[,2] <= 2^2),1]
d = d[d[,2] > 2^2,] # 3 doesn't plot
dim(t)[1]
## [1] 25
cat('some clusterID size')
## some clusterID size
print(d)
## Var1 Freq
## 1 1 818
## 21 21 779
## 25 25 663
## 16 16 622
## 19 19 574
## 17 17 556
## 11 11 528
## 10 10 494
## 22 22 492
## 7 7 420
## 6 6 399
## 18 18 393
## 9 9 377
## 12 12 359
## 3 3 335
## 23 23 324
## 4 4 314
## 14 14 303
## 24 24 260
## 2 2 209
## 20 20 209
## 13 13 207
## 8 8 201
## 5 5 163
mydf = matrix(NA, ncol = 36, nrow = dim(d)[1])
colnames(mydf) = c('cliID', 'vcountD', 'ecountD', 'connectedD',
'starD','-Log10qD', 'evD', 'hDegD',
'vcountS', 'ecountS', 'centralityS',
'starS','-Log10qS', 'evS', 'hDegS',
'vcountGSS', 'ecountGSS', 'centralityGSS',
'starGSS','-Log10qGSS', 'evGSS', 'hDegGSS',
'vShrinkage', 'eShrinkage',
'centr_degreeD', 'centr_cloD' , 'centr_betwD',
'centr_degreeS', 'centr_cloS' , 'centr_betwS',
'centr_degreeGSS', 'centr_cloGSS', 'centr_betwGSS',
'-Log10edge_densityD',
'-Log10edge_densityS',
'-Log10edge_densityGSS')
d[,1] = as.numeric(d[,1])
d[,2] = as.numeric(d[,2])
for (i in 1:dim(d)[1]) {
j = as.numeric(d[i,1])
a = which(cl$membership == j) # vertex indices of cluster j, cf. cl[[j]]
tmpNetD = induced.subgraph(graph = g, vids = a)
# do not want loops and multiple edges in count
tmpNetS = induced.subgraph(graph = gs, vids = a)
gss = delete.vertices(tmpNetS, which(degree(tmpNetS) < 2))
vcS = vcount(tmpNetS)
ecS = ecount(tmpNetS)
vcD = vcount(tmpNetD)
ecD = ecount(tmpNetD)
vcgss = vcount(gss)
ecgss = ecount(gss)
starS = max(centr_degree(tmpNetS, loops = FALSE,
normalized = TRUE, mode = 'all')$centralization,
centr_clo(tmpNetS, mode = 'all', normalized = TRUE)$centralization,
centr_betw(tmpNetS, directed = FALSE, normalized = TRUE)$centralization)
if (length(E(gss))) {
starGSS = max(centr_degree(gss, loops = FALSE,
normalized = TRUE, mode = 'all')$centralization,
centr_clo(gss, mode = 'all', normalized = TRUE)$centralization,
centr_betw(gss, directed = FALSE, normalized = TRUE)$centralization)
} else starGSS = -1
(hubD = length(which(degree(tmpNetD) >= 0.6*max(degree(tmpNetD)))))
(hubS = length(which(degree(tmpNetS) >= 0.6*max(degree(tmpNetS)))))
# cat('is star if measure close to 1:', j, starS, hubS, '\n')
(hubGSS = length(which(degree(gss) >= 0.6*max(degree(gss)))))
qS = 2*ecount(tmpNetS)/(vcS*(vcS - 1)) # == edge_density(tmpNetS)
qD = 1*ecount(tmpNetD)/(vcD*(vcD - 1))
qGSS = ifelse(vcgss > 2, 2*ecount(gss)/(vcgss*(vcgss - 1)), 0)
mydf[i,1] = j
mydf[i,2] = vcD
mydf[i,3] = ecount(tmpNetD)
mydf[i,4] = is.connected(tmpNetD)
mydf[i,5] = ifelse((hubD) == 1, 1, 0)
mydf[i,6] = -log10(qD)
mydf[i,7] = ecount(tmpNetD)/vcD
mydf[i,8] = hubD
mydf[i,9] = vcS
mydf[i,10] = ecount(tmpNetS)
mydf[i,11] = starS
mydf[i,12] = ifelse((hubS) == 1, 1, 0)
mydf[i,13] = -log10(qS)
mydf[i,14] = ecount(tmpNetS)/vcS
mydf[i,15] = hubS
mydf[i,16] = vcgss
mydf[i,17] = ecount(gss)
mydf[i,18] = starGSS
mydf[i,19] = ifelse((hubGSS) == 1, 1, 0)
mydf[i,20] = ifelse(qGSS, -log10(qGSS), -1)
mydf[i,21] = ecount(gss)/vcgss
mydf[i,22] = hubGSS
mydf[i,23] = round(vcount(tmpNetD)/vcount(gss))
mydf[i,24] = ifelse(ecount(gss) != 0,
round(ecount(tmpNetD)/ecount(gss)),
-1)
mydf[i,25] = centr_degree(tmpNetD, loops = FALSE,
normalized = TRUE, mode = 'all')$centralization
mydf[i,26] = centr_clo(tmpNetD, mode = 'all',
normalized = TRUE)$centralization
mydf[i,27] = centr_betw(tmpNetD, directed = FALSE,
normalized = TRUE)$centralization
mydf[i,28] = centr_degree(tmpNetS, loops = FALSE,
normalized = TRUE, mode = 'all')$centralization
mydf[i,29] = centr_clo(tmpNetS, mode = 'all',
normalized = TRUE)$centralization
mydf[i,30] = centr_betw(tmpNetS, directed = FALSE,
normalized = TRUE)$centralization
mydf[i,31] = ifelse(ecount(gss) != 0,
centr_degree(gss, loops = FALSE,
normalized = TRUE, mode = 'all')$centralization,
-1)
mydf[i,32] = ifelse(ecount(gss) != 0,
centr_clo(gss, mode = 'all',
normalized = TRUE)$centralization,
-1)
mydf[i,33] = ifelse(ecount(gss) != 0,
centr_betw(gss, directed = FALSE,
normalized = TRUE)$centralization,
-1)
mydf[i,34] = -log10(edge_density(simplify(tmpNetD), loops = FALSE))
mydf[i,35] = -log10(edge_density(simplify(tmpNetS), loops = FALSE))
mydf[i,36] = ifelse(ecount(gss) != 0,
-log10(edge_density(simplify(gss), loops = FALSE)),
-1)
}
rownames(mydf) = mydf[,1]
write.table(mydf, paste0(dir1, '/', myname, '_mydf.txt'),
sep = '\t', row.names = FALSE, quote = FALSE)
mydf.se = (mydf[order(-mydf[,3], -mydf[,2]),] )
# print (mydf.se)
mydf.sv = (mydf[order(-mydf[,2], -mydf[,3]),] )
# print (mydf.sv)
dim(mydf)
## [1] 24 36
indI = union(intersect(intersect(which(mydf[,11] > 0.60),
which(round(mydf[,14]) == 1)),
which(mydf[,30] > 0.60)),
which(mydf[,15] == 1))
myStars = as.vector(mydf[indI, 1])
denselyConnectedG = as.vector(mydf[as.vector(which(round(mydf[,11], digits = 2) == 0.00)),1])
unConnected = as.vector(mydf[as.vector(which(mydf[,4] == 0)),1])
myStars = setdiff(myStars, unConnected)
denselyConnectedG = setdiff(denselyConnectedG, unConnected)
mat <- matrix(nrow = 0, ncol = 8)
colnames(mat) = c('nodeID',
'x', 'y',
'clu', 'origin',
'degreeSimplifiedSuperCluster', 'degreeSimplifiedCluster',
'isconnected')
head(mat)
## nodeID x y clu origin degreeSimplifiedSuperCluster
## degreeSimplifiedCluster isconnected
myClusters = 1
V(g)$x = vector(mode = "double", vcount(g))
V(g)$y = vector(mode = "double", vcount(g))
# list.vertex.attributes(g)
# list.edge.attributes(g)
cntV = 0
Vid = unlist(lapply(toSmallInd, function(x) cl[[x]]))
cnt = length(Vid)
if (cnt) {
mini = induced.subgraph(graph = g, vids = strtoi(match(Vid, cl$names)))
mydeg = degree(mini, loops = FALSE, normalized = FALSE, mode = "all")
for (l in 1:vcount(mini)) {
tmpvec = c()
tmpvec = c(V(mini)$nodeID[l], 0, 0, 0,
paste0('mini','_',l), mydeg[l], mydeg[l], is.connected(mini))
mat = rbind(mat,tmpvec)
}
myseq = paste0('mini','_',seq(1,vcount(mini),1))
rownames(mat)[which(rownames(mat) == 'tmpvec')] = myseq
print(cntV)
print(dim(mat))
# print(head(mat))
print(tail(mat))
print(myClusters)
}
## [1] 0
## [1] 1 8
## nodeID x y clu origin degreeSimplifiedSuperCluster
## mini_1 "node9200" "0" "0" "0" "mini_1" "0"
## degreeSimplifiedCluster isconnected
## mini_1 "0" "TRUE"
## [1] 1
mydf = mydf[which(mydf[,1] %ni% c(myStars, unConnected, denselyConnectedG)),]
big.0.0 = mydf[union((which((mydf[,2] >= 2^10))),
(which(((mydf[,3] >= 2^11))))),]
dim(big.0.0)
## [1] 0 36
normal.0.0 = mydf[which(as.vector(mydf[,1]) %ni% as.vector(big.0.0[,1])),]
dim(normal.0.0)
## [1] 24 36
cntV = 0
tmplist = vector()
print(normal.0.0)
## cliID vcountD ecountD connectedD starD -Log10qD evD hDegD vcountS
## 1 1 818 1411 1 0 2.675448 1.724939 77 818
## 21 21 779 1362 1 0 2.648340 1.748395 22 779
## 25 25 663 1128 1 0 2.590062 1.701357 137 663
## 16 16 622 986 1 0 2.593005 1.585209 102 622
## 19 19 574 921 1 0 2.552807 1.604530 100 574
## 17 17 556 881 1 0 2.544392 1.584532 25 556
## 11 11 528 809 1 0 2.536496 1.532197 28 528
## 10 10 494 769 1 0 2.500648 1.556680 89 494
## 22 22 492 771 1 0 2.495992 1.567073 65 492
## 7 7 420 601 1 0 2.466589 1.430952 34 420
## 6 6 399 584 1 0 2.434443 1.463659 37 399
## 18 18 393 587 1 0 2.419041 1.493639 48 393
## 9 9 377 565 1 0 2.399481 1.498674 43 377
## 12 12 359 523 1 0 2.390476 1.456825 38 359
## 3 3 335 479 1 0 2.368456 1.429851 30 335
## 23 23 324 456 1 0 2.360783 1.407407 27 324
## 4 4 314 441 1 0 2.348035 1.404459 27 314
## 14 14 303 419 1 0 2.339236 1.382838 75 303
## 24 24 260 369 1 0 2.261247 1.419231 26 260
## 2 2 209 272 1 0 2.203641 1.301435 9 209
## 20 20 209 273 1 0 2.202047 1.306220 11 209
## 13 13 207 281 1 0 2.181131 1.357488 44 207
## 8 8 201 276 1 0 2.163317 1.373134 15 201
## 5 5 163 215 1 0 2.089264 1.319018 9 163
## ecountS centralityS starS -Log10qS evS hDegS vcountGSS ecountGSS
## 1 1411 0.07755203 0 2.374418 1.724939 77 761 1354
## 21 1362 0.09578406 0 2.347310 1.748395 22 729 1312
## 25 1128 0.07325674 0 2.289032 1.701357 137 613 1078
## 16 986 0.09630405 0 2.291975 1.585209 102 569 933
## 19 921 0.07930952 0 2.251777 1.604530 100 531 878
## 17 881 0.08666384 0 2.243362 1.584532 25 495 820
## 11 809 0.08868037 0 2.235466 1.532197 28 463 744
## 10 769 0.09333869 0 2.199618 1.556680 89 434 709
## 22 771 0.10084908 0 2.194962 1.567073 65 442 721
## 7 601 0.08345350 0 2.165559 1.430952 34 373 554
## 6 584 0.10964195 0 2.133413 1.463659 37 358 543
## 18 587 0.07802414 0 2.118011 1.493639 48 347 541
## 9 565 0.08466311 0 2.098451 1.498674 43 342 530
## 12 523 0.08690310 0 2.089446 1.456825 38 316 480
## 3 479 0.08485870 0 2.067426 1.429851 30 294 438
## 23 456 0.11948614 0 2.059753 1.407407 27 274 406
## 4 441 0.09670127 0 2.047005 1.404459 27 272 399
## 14 419 0.06841070 0 2.038206 1.382838 75 266 382
## 24 369 0.10797993 0 1.960217 1.419231 26 225 334
## 2 272 0.17587260 0 1.902611 1.301435 9 175 238
## 20 273 0.14597233 0 1.901017 1.306220 11 179 243
## 13 281 0.11529948 0 1.880101 1.357488 44 180 254
## 8 276 0.10395698 0 1.862287 1.373134 15 180 255
## 5 215 0.12948666 0 1.788234 1.319018 9 141 193
## centralityGSS starGSS -Log10qGSS evGSS hDegGSS vShrinkage eShrinkage
## 1 0.07668090 0 2.329550 1.779238 70 1 1
## 21 0.09518207 0 2.305895 1.799726 18 1 1
## 25 0.07304050 0 2.240563 1.758564 121 1 1
## 16 0.09596173 0 2.238549 1.639719 95 1 1
## 19 0.07565329 0 2.204846 1.653484 88 1 1
## 17 0.08335365 0 2.173488 1.656566 19 1 1
## 11 0.08728631 0 2.157620 1.606911 24 1 1
## 10 0.08986917 0 2.122301 1.633641 80 1 1
## 22 0.09923110 0 2.130896 1.631222 54 1 1
## 7 0.07977477 0 2.097712 1.485255 27 1 1
## 6 0.10871183 0 2.070721 1.516760 32 1 1
## 18 0.07874113 0 2.045178 1.559078 38 1 1
## 9 0.08466011 0 2.041475 1.549708 39 1 1
## 12 0.08193071 0 2.015726 1.518987 34 1 1
## 3 0.07764662 0 1.992711 1.489796 25 1 1
## 23 0.11706706 0 1.964357 1.481752 21 1 1
## 4 0.09439917 0 1.965535 1.466912 23 1 1
## 14 0.06778322 0 1.965034 1.436090 154 1 1
## 24 0.10467975 0 1.877654 1.484444 23 1 1
## 2 0.17387701 0 1.805980 1.360000 30 1 1
## 20 0.15885591 0 1.816637 1.357542 6 1 1
## 13 0.11110572 0 1.802262 1.411111 37 1 1
## 8 0.09907679 0 1.800555 1.416667 42 1 1
## 5 0.12884898 0 1.708760 1.368794 30 1 1
## centr_degreeD centr_cloD centr_betwD centr_degreeS centr_cloS
## 1 0.004018468 0.07755203 0.02939440 0.008036936 0.07755203
## 21 0.004834526 0.09578406 0.04118809 0.009669052 0.09578406
## 25 0.003482776 0.07325674 0.03115021 0.006965552 0.07325674
## 16 0.003901096 0.09630405 0.04232968 0.007802192 0.09630405
## 19 0.004195194 0.07930952 0.03387500 0.008390388 0.07930952
## 17 0.005272059 0.08666384 0.04691946 0.010544118 0.08666384
## 11 0.005652917 0.08868037 0.05323433 0.011305835 0.08868037
## 10 0.004976170 0.09333869 0.04552081 0.009952341 0.09333869
## 22 0.004975269 0.10084908 0.05176426 0.009950538 0.10084908
## 7 0.004961688 0.08345350 0.06036542 0.009923376 0.08345350
## 6 0.006404820 0.10964195 0.07334113 0.012809640 0.10964195
## 18 0.005144449 0.07802414 0.05404493 0.010288898 0.07802414
## 9 0.005351064 0.08466311 0.04034741 0.010702128 0.08466311
## 12 0.005739167 0.08690310 0.07284604 0.011478334 0.08690310
## 3 0.006235277 0.08485870 0.07353921 0.012470554 0.08485870
## 23 0.008076457 0.11128458 0.11948614 0.016152914 0.11128458
## 4 0.008345621 0.09670127 0.07774151 0.016691243 0.09670127
## 14 0.005390420 0.06841070 0.05225712 0.010780841 0.06841070
## 24 0.008096136 0.10797993 0.08591225 0.016192272 0.10797993
## 2 0.010672148 0.11005431 0.17587260 0.021344296 0.11005431
## 20 0.010648922 0.09967458 0.14597233 0.021297845 0.09967458
## 13 0.008051148 0.10541781 0.11529948 0.016102297 0.10541781
## 8 0.010741206 0.10117089 0.10395698 0.021482412 0.10117089
## 5 0.013630090 0.10561737 0.12948666 0.027260179 0.10561737
## centr_betwS centr_degreeGSS centr_cloGSS centr_betwGSS
## 1 0.02939440 0.007178767 0.07668090 0.02689632
## 21 0.04118809 0.010193555 0.09518207 0.04315943
## 25 0.03115021 0.007348930 0.07304050 0.02769252
## 16 0.04232968 0.008340164 0.09596173 0.04327235
## 19 0.03387500 0.008888255 0.07565329 0.03354910
## 17 0.04691946 0.011558581 0.08335365 0.04505368
## 11 0.05323433 0.012578528 0.08728631 0.05256736
## 10 0.04552081 0.010980669 0.08986917 0.04378383
## 22 0.05176426 0.010791589 0.09923110 0.05044817
## 7 0.06036542 0.010890357 0.07977477 0.06272922
## 6 0.07334113 0.013989866 0.10871183 0.06791759
## 18 0.05404493 0.011284242 0.07874113 0.05568254
## 9 0.04034741 0.011505951 0.08466011 0.04331670
## 12 0.07284604 0.012657972 0.08193071 0.06513133
## 3 0.07353921 0.013815513 0.07764662 0.07042024
## 23 0.11948614 0.018584357 0.11176890 0.11706706
## 4 0.07774151 0.015115484 0.09439917 0.07072632
## 14 0.05225712 0.008090337 0.06778322 0.05544684
## 24 0.08591225 0.018157431 0.10467975 0.07891531
## 2 0.17587260 0.019068500 0.10991846 0.17387701
## 20 0.14597233 0.024344569 0.10409138 0.15885591
## 13 0.11529948 0.017952420 0.10534605 0.11110572
## 8 0.10395698 0.017889649 0.09854653 0.09907679
## 5 0.12948666 0.023638232 0.10563334 0.12884898
## -Log10edge_densityD -Log10edge_densityS -Log10edge_densityGSS
## 1 2.675448 2.374418 2.329550
## 21 2.648340 2.347310 2.305895
## 25 2.590062 2.289032 2.240563
## 16 2.593005 2.291975 2.238549
## 19 2.552807 2.251777 2.204846
## 17 2.544392 2.243362 2.173488
## 11 2.536496 2.235466 2.157620
## 10 2.500648 2.199618 2.122301
## 22 2.495992 2.194962 2.130896
## 7 2.466589 2.165559 2.097712
## 6 2.434443 2.133413 2.070721
## 18 2.419041 2.118011 2.045178
## 9 2.399481 2.098451 2.041475
## 12 2.390476 2.089446 2.015726
## 3 2.368456 2.067426 1.992711
## 23 2.360783 2.059753 1.964357
## 4 2.348035 2.047005 1.965535
## 14 2.339236 2.038206 1.965034
## 24 2.261247 1.960217 1.877654
## 2 2.203641 1.902611 1.805980
## 20 2.202047 1.901017 1.816637
## 13 2.181131 1.880101 1.802262
## 8 2.163317 1.862287 1.800555
## 5 2.089264 1.788234 1.708760
if (!is.null(dim(normal.0.0))) {
IND = as.vector(normal.0.0[,1])
} else {
IND = normal.0.0[1]
}
if (length(IND)) {
for (i in 1:length(IND)) {
print(IND[i])
ttmplist = vector()
tmplist = c(tmplist, paste0((IND)[i],'_',seq(1,(length(cl[[IND[i]]])),1)))
ttmplist = c(ttmplist, paste0((IND)[i],'_',seq(1,(length(cl[[IND[i]]])),1)))
moderateNet = induced.subgraph(graph = g, vids = strtoi(match(cl[[IND[i]]], cl$names))) #### #### #### ####
moderateNetGS = induced.subgraph(graph = gs,vids = strtoi(match(cl[[IND[i]]], cl$names))) #### #### #### ####
cat("i = ", i, "\n")
cat("# vertices", vcount(moderateNet), "\n")
cntV = cntV + vcount(moderateNet)
cat("# edges", ecount(moderateNet), "\n")
cat("#edges/#vertices", ecount(moderateNet)/vcount(moderateNet), "\n")
cat('\n')
if ((ecount(moderateNet) < 2^6) & (vcount(moderateNet) < 2^6)) {
l2 = layout.fruchterman.reingold(moderateNet, niter = 2^13)
} else {
l1 = layout_on_grid(moderateNet, dim = 2)
l2 = layout_with_kk(moderateNet, coords = l1, dim = 2,
maxiter = 999 * vcount(moderateNet),
epsilon = 0, kkconst = vcount(moderateNet),
minx = NULL, maxx = NULL,
miny = NULL, maxy = NULL,
minz = NULL,maxz = NULL)
z = round(ecount(moderateNet)/vcount(moderateNet))
l2 = l2*2*z
}
if (vcount(moderateNet) >= 2^2) {
plot(0, type = "n",
#ann=FALSE,
axes = FALSE,
xlim = extendrange(l2[,1]),
ylim = extendrange(l2[,2]),
xlab = paste0('moderateNet_', myClusters),
ylab = paste0('clu: ',(IND)[i]))
plot(moderateNet, layout = l2,
edge.label = '',
vertex.label = V(moderateNet)$shortName,
vertex.color = 'gray60', #### #### #### ####
rescale = FALSE, add = TRUE,
vertex.label.cex = 0.5,
edge.arrow.size = 0.25,
edge.arrow.width = 0.25,
edge.lty = 'solid',
edge.color = 'gray',
edge.width = 0.25,
# edge.arrow.mode = 0,
edge.label.cex = 0.25,
edge.curved = autocurve.edges(moderateNet))
}
###
mydeg = degree(moderateNet, loops = FALSE,
normalized = FALSE, mode = "all")
for (l in 1:vcount(moderateNet)) {
tmpvec = c()
tmpvec = c(V(moderateNet)$nodeID[l], l2[l,1], l2[l,2], myClusters,
paste0('normal','_',ttmplist[l]),
mydeg[l], mydeg[l], is.connected(moderateNet))
mat = rbind(mat,tmpvec)
}
myClusters = myClusters + 1
}
myseq = paste0(rep('moderate'),'_', tmplist)
rownames(mat)[which(rownames(mat) == 'tmpvec')] = myseq
print(dim(mat))
# print(head(mat))
print(tail(mat))
print(myClusters)
}
## [1] 1
## i = 1
## # vertices 818
## # edges 1411
## #edges/#vertices 1.724939
## [1] 21
## i = 2
## # vertices 779
## # edges 1362
## #edges/#vertices 1.748395
## [1] 25
## i = 3
## # vertices 663
## # edges 1128
## #edges/#vertices 1.701357
## [1] 16
## i = 4
## # vertices 622
## # edges 986
## #edges/#vertices 1.585209
## [1] 19
## i = 5
## # vertices 574
## # edges 921
## #edges/#vertices 1.60453
## [1] 17
## i = 6
## # vertices 556
## # edges 881
## #edges/#vertices 1.584532
## [1] 11
## i = 7
## # vertices 528
## # edges 809
## #edges/#vertices 1.532197
## [1] 10
## i = 8
## # vertices 494
## # edges 769
## #edges/#vertices 1.55668
## [1] 22
## i = 9
## # vertices 492
## # edges 771
## #edges/#vertices 1.567073
## [1] 7
## i = 10
## # vertices 420
## # edges 601
## #edges/#vertices 1.430952
## [1] 6
## i = 11
## # vertices 399
## # edges 584
## #edges/#vertices 1.463659
## [1] 18
## i = 12
## # vertices 393
## # edges 587
## #edges/#vertices 1.493639
## [1] 9
## i = 13
## # vertices 377
## # edges 565
## #edges/#vertices 1.498674
## [1] 12
## i = 14
## # vertices 359
## # edges 523
## #edges/#vertices 1.456825
## [1] 3
## i = 15
## # vertices 335
## # edges 479
## #edges/#vertices 1.429851
## [1] 23
## i = 16
## # vertices 324
## # edges 456
## #edges/#vertices 1.407407
## [1] 4
## i = 17
## # vertices 314
## # edges 441
## #edges/#vertices 1.404459
## [1] 14
## i = 18
## # vertices 303
## # edges 419
## #edges/#vertices 1.382838
## [1] 24
## i = 19
## # vertices 260
## # edges 369
## #edges/#vertices 1.419231
## [1] 2
## i = 20
## # vertices 209
## # edges 272
## #edges/#vertices 1.301435
## [1] 20
## i = 21
## # vertices 209
## # edges 273
## #edges/#vertices 1.30622
## [1] 13
## i = 22
## # vertices 207
## # edges 281
## #edges/#vertices 1.357488
## [1] 8
## i = 23
## # vertices 201
## # edges 276
## #edges/#vertices 1.373134
## [1] 5
## i = 24
## # vertices 163
## # edges 215
## #edges/#vertices 1.319018
## [1] 10000 8
## nodeID x y clu
## moderate_5_158 "node9694" "10.7438711990581" "2.35020881365661" "24"
## moderate_5_159 "node9900" "17.9753776799184" "14.8238428319454" "24"
## moderate_5_160 "node9901" "6.18628438612962" "13.1078200632064" "24"
## moderate_5_161 "node9937" "-0.483376095241185" "8.7875745242926" "24"
## moderate_5_162 "node9979" "12.1238113826569" "15.0392982147078" "24"
## moderate_5_163 "node9991" "3.91893323565064" "8.76043279722705" "24"
## origin degreeSimplifiedSuperCluster
## moderate_5_158 "normal_5_158" "1"
## moderate_5_159 "normal_5_159" "3"
## moderate_5_160 "normal_5_160" "3"
## moderate_5_161 "normal_5_161" "2"
## moderate_5_162 "normal_5_162" "4"
## moderate_5_163 "normal_5_163" "3"
## degreeSimplifiedCluster isconnected
## moderate_5_158 "1" "TRUE"
## moderate_5_159 "3" "TRUE"
## moderate_5_160 "3" "TRUE"
## moderate_5_161 "2" "TRUE"
## moderate_5_162 "4" "TRUE"
## moderate_5_163 "3" "TRUE"
## [1] 25
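The layout step in the loop above (Fruchterman-Reingold for small subgraphs, Kamada-Kawai seeded from a grid otherwise) recurs in every block that follows, so it may be worth wrapping in a helper. The following is a minimal sketch, assuming igraph is attached; `layout_cluster` and `small_limit` are hypothetical names that belong neither to igraph nor to the original script. Unlike the original, it clamps the scale factor to at least 1, because `round(ecount/vcount)` is 0 for very sparse subgraphs and would collapse all coordinates onto the origin.

```r
library(igraph)

# Hypothetical helper: choose a layout by subgraph size, as in the loops above.
layout_cluster <- function(net, small_limit = 2^6) {
  if (ecount(net) < small_limit && vcount(net) < small_limit) {
    # Small subgraph: force-directed layout with many iterations
    return(layout_with_fr(net, niter = 2^13))
  }
  # Larger subgraph: Kamada-Kawai, started from a regular grid
  start <- layout_on_grid(net, dim = 2)
  coords <- layout_with_kk(net, coords = start, dim = 2,
                           maxiter = 999 * vcount(net),
                           epsilon = 0, kkconst = vcount(net))
  # Scale by the edge/vertex ratio; max(1, .) is an addition (not in the
  # original) that keeps the factor from becoming zero
  z <- max(1, round(ecount(net) / vcount(net)))
  coords * 2 * z
}

set.seed(1)
small <- layout_cluster(make_ring(10))    # takes the Fruchterman-Reingold branch
large <- layout_cluster(make_ring(100))   # takes the Kamada-Kawai branch
```

Each block could then call `l2 = layout_cluster(moderateNet)` (or `starNet`, `mysubg`, and so on) in place of the repeated if/else.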
IND = myStars
cntV = 0
tmplist = vector()
if (length(IND)) {
for (i in 1:length(IND)) {
ttmplist = vector()
tmplist = c(tmplist, paste0((IND)[i],'_',seq(1,(length(cl[[IND[i]]])),1)))
ttmplist = c(ttmplist, paste0((IND)[i],'_',seq(1,(length(cl[[IND[i]]])),1)))
starNet = induced.subgraph(graph = g, vids = strtoi(match(cl[[IND[i]]], cl$names))) #### #### #### ####
cat("i = ", i, "\n")
cat("# vertices", vcount(starNet), "\n")
cntV = cntV + vcount(starNet)
cat("# edges", ecount(starNet), "\n")
cat("#edges/#vertices", ecount(starNet)/vcount(starNet), "\n")
cat('\n')
if ((ecount(starNet) < 2^6) & (vcount(starNet) < 2^6)) {
l2 = layout.fruchterman.reingold(starNet, niter = 2^13)
} else {
l1 = layout_on_grid(starNet, dim = 2)
l2 = layout_with_kk(starNet, coords = l1, dim = 2,
maxiter = 999 * vcount(starNet),
epsilon = 0, kkconst = vcount(starNet),
#weights = rep(100, length.out),
minx = NULL, maxx = NULL,
miny = NULL, maxy = NULL,
minz = NULL,maxz = NULL)
z = round(ecount(starNet)/vcount(starNet))
l2 = l2*2*z
}
if ((vcount(starNet) >= 2^2) & (ecount(starNet) <= 2^11)) {
plot(0, type = "n", ann = TRUE, axes = FALSE,
xlim = extendrange(l2[,1]),
ylim = extendrange(l2[,2]),
xlab = paste0('starNet_', myClusters),
ylab = paste0('clu: ',(IND)[i]))
plot(starNet, layout = l2,
edge.label = '',
vertex.label = V(starNet)$shortName,
vertex.color = 'gray60', #### #### #### ####
rescale = FALSE, add = TRUE,
vertex.label.cex = 0.5,
edge.arrow.size = 0.25,
edge.arrow.width = 0.25,
edge.lty = 'solid',
edge.color = 'gray',
edge.width = 0.25,
# edge.arrow.mode = 0,
edge.label.cex = 0.25,
edge.curved = autocurve.edges(starNet))
}
###
mydeg = degree(starNet, loops = FALSE,
normalized = FALSE, mode = "all")
for (l in 1:vcount(starNet)) {
tmpvec = c()
tmpvec = c(V(starNet)$nodeID[l], l2[l,1], l2[l,2], myClusters,
paste0('starNet','_',ttmplist[l]),
mydeg[l], mydeg[l], is.connected(starNet))
mat = rbind(mat,tmpvec)
}
myClusters = myClusters + 1
}
myseq = paste0(rep('starNet'),'_', tmplist)
rownames(mat)[which(rownames(mat) == 'tmpvec')] = myseq
print(dim(mat))
# print(head(mat))
print(tail(mat))
print(myClusters)
}
IND = denselyConnectedG
cntV = 0
tmplist = vector()
if (length(IND)) {
for (i in 1:length(IND)) {
ttmplist = vector()
tmplist = c(tmplist, paste0((IND)[i],'_',seq(1,(length(cl[[IND[i]]])),1)))
ttmplist = c(ttmplist, paste0((IND)[i],'_',seq(1,(length(cl[[IND[i]]])),1)))
denselyConnectedNet = induced.subgraph(graph = g, vids = strtoi(match(cl[[IND[i]]], cl$names))) #### #### #### ####
cat("i = ", i, "\n")
cat("# vertices", vcount(denselyConnectedNet), "\n")
cntV = cntV + vcount(denselyConnectedNet)
cat("# edges", ecount(denselyConnectedNet), "\n")
cat("#edges/#vertices",
ecount(denselyConnectedNet)/vcount(denselyConnectedNet),
"\n")
# cat ('duplicated connect rows:', '\n')
# print (cbind(V(denselyConnectedNet)$nodeID[get.edgelist(denselyConnectedNet)[duplicated(get.edgelist(denselyConnectedNet)),1]],
# V(denselyConnectedNet)$nodeID[get.edgelist(denselyConnectedNet)[duplicated(get.edgelist(denselyConnectedNet)),2]]))
cat('\n')
if ((ecount(denselyConnectedNet) < 2^6) & (vcount(denselyConnectedNet) < 2^6)) {
l2 = layout.fruchterman.reingold(denselyConnectedNet, niter = 2^13)
} else {
l1 = layout_on_grid(denselyConnectedNet, dim = 2)
l2 = layout_with_kk(denselyConnectedNet, coords = l1, dim = 2,
maxiter = 999 * vcount(denselyConnectedNet),
epsilon = 0, kkconst = vcount(denselyConnectedNet),
minx = NULL, maxx = NULL,
miny = NULL, maxy = NULL,
minz = NULL,maxz = NULL)
z = round(ecount(denselyConnectedNet)/vcount(denselyConnectedNet))
l2 = l2*2*z
}
if ((vcount(denselyConnectedNet) >= 2^2) & (ecount(denselyConnectedNet) <= 2^11)) {
plot(0, type = "n", ann = TRUE, axes = FALSE,
xlim = extendrange(l2[,1]),
ylim = extendrange(l2[,2]),
xlab = paste0('denselyConnectedNet_', myClusters),
ylab = paste0('clu: ',(IND)[i]))
plot(denselyConnectedNet, layout = l2,
edge.label = '',
vertex.label = V(denselyConnectedNet)$shortName,
vertex.color = 'gray60', #### #### #### ####
rescale = FALSE, add = TRUE,
vertex.label.cex = 0.5,
edge.arrow.size = 0.25,
edge.arrow.width = 0.25,
edge.lty = 'solid',
edge.color = 'gray',
edge.width = 0.25,
# edge.arrow.mode = 0,
edge.label.cex = 0.25,
edge.curved = autocurve.edges(denselyConnectedNet))
}
###
mydeg = degree(denselyConnectedNet, loops = FALSE,
normalized = FALSE, mode = "all")
for (l in 1:vcount(denselyConnectedNet)) {
tmpvec = c()
tmpvec = c(V(denselyConnectedNet)$nodeID[l], l2[l,1], l2[l,2], myClusters,
paste0('denselyConnectedNet','_',ttmplist[l]),
mydeg[l], mydeg[l], is.connected(denselyConnectedNet))
mat = rbind(mat,tmpvec)
}
myClusters = myClusters + 1
}
myseq = paste0(rep('denselyConnectedNet'),'_', tmplist)
rownames(mat)[which(rownames(mat) == 'tmpvec')] = myseq
print(dim(mat))
# print(head(mat))
print(tail(mat))
print(myClusters)
}
IND = unConnected
cntV = 0
tmplist = vector()
if (length(IND)) {
for (i in 1:length(IND)) {
tmplist = c(tmplist, paste0((IND)[i],'_',seq(1,(length(cl[[IND[i]]])),1)))
print(tmplist)
unConnectedNet = induced.subgraph(graph = g, vids = strtoi(match(cl[[IND[i]]], cl$names))) #### #### #### ####
mydegSC = degree(unConnectedNet, loops = FALSE, normalized = FALSE, mode = "all")
mydegSC = cbind(strtoi(match(cl[[IND[i]]], cl$names)), mydegSC)
graphs = decompose(unConnectedNet, mode = "weak")
# lapply(graphs,function(x) cat('######## ######## unlist(vcount(x))', unlist(vcount(x)), '\n'))
# component sizes, computed once
sizes = unlist(lapply(graphs, function(x) vcount(x)))
# which.max() returns a single index even if several components tie for largest
biggest = which.max(sizes)
unConnectedNet = graphs[[biggest]]
# every other component counts as a leftover
tempora = setdiff(seq_along(graphs), biggest) #### #### #### ####
if ((length(graphs) - 1) != 1) {
vids = as.vector(unlist(sapply(tempora,
function(x) match(V(graphs[[x]])$geneID, cl$names))))
unConnectedNetLeftovers = induced.subgraph(graph = g,vids = strtoi(vids))
} else {
vids = match(V(graphs[[tempora]])$geneID, cl$names)
unConnectedNetLeftovers = induced.subgraph(graph = g,vids = strtoi(vids))
}
cat("# vertices", vcount(unConnectedNetLeftovers), "\n")
cntV = cntV + vcount(unConnectedNetLeftovers)
cat("# edges", ecount(unConnectedNetLeftovers), "\n")
cat("#edges/#vertices",
ecount(unConnectedNetLeftovers)/vcount(unConnectedNetLeftovers),
"\n")
cat('\n')
if ((ecount(unConnectedNetLeftovers) < 2^6) & (vcount(unConnectedNetLeftovers) < 2^6)) {
l2 = layout.fruchterman.reingold(unConnectedNetLeftovers, niter = 2^13)
} else {
l1 = layout_on_grid(unConnectedNetLeftovers, dim = 2)
l2 = layout_with_kk(unConnectedNetLeftovers, coords = l1, dim = 2,
maxiter = 999 * vcount(unConnectedNetLeftovers),
epsilon = 0, kkconst = vcount(unConnectedNetLeftovers),
#weights = rep(100, length.out),
minx = NULL, maxx = NULL,
miny = NULL, maxy = NULL,
minz = NULL,maxz = NULL)
z = round(ecount(unConnectedNetLeftovers)/vcount(unConnectedNetLeftovers))
l2 = l2*2*z
}
if ((vcount(unConnectedNetLeftovers) >= 2^2) & (ecount(unConnectedNetLeftovers) <= 2^11)) {
plot(0, type = "n", ann = TRUE, axes = FALSE,
xlim = extendrange(l2[,1]),
ylim = extendrange(l2[,2]),
xlab = paste0('unConnectedNetLeftovers_', myClusters),
ylab = paste0('clu: ',(IND)[i]))
plot(unConnectedNetLeftovers, layout = l2,
edge.label = '',
vertex.label = V(unConnectedNetLeftovers)$shortName,
vertex.color = 'gray60', #### #### #### ####
rescale = FALSE, add = TRUE,
vertex.label.cex = 0.5,
edge.arrow.size = 0.25,
edge.arrow.width = 0.25,
edge.lty = 'solid',
edge.color = 'gray',
edge.width = 0.25,
# edge.arrow.mode = 0,
edge.label.cex = 0.25,
edge.curved = autocurve.edges(unConnectedNetLeftovers))
}
ttmplist = vector()
ttmplist = c(ttmplist, paste0((IND)[i],
'_unConnectedNetLeftovers_',
seq(1,vcount(unConnectedNetLeftovers),1)))
print(ttmplist)
mydeg = degree(unConnectedNetLeftovers, loops = FALSE,
normalized = FALSE, mode = "all")
mydegSCucl = mydegSC[which(!is.na(match(mydegSC[,1],
V(unConnectedNetLeftovers)$numVec))),2]
for (l in 1:vcount(unConnectedNetLeftovers)) {
tmpvec = c()
tmpvec = c(V(unConnectedNetLeftovers)$nodeID[l], l2[l,1], l2[l,2], myClusters,
paste0(ttmplist[l]),
mydegSCucl[l], mydeg[l],
is.connected(unConnectedNetLeftovers))
mat = rbind(mat,tmpvec)
}
myClusters = myClusters + 1
tmpNetS = as.undirected(unConnectedNet, mode = "each")
tmpNetS = simplify(tmpNetS,
remove.multiple = TRUE,
remove.loops = TRUE,
edge.attr.comb = "concat")
mysn = length(which(degree(tmpNetS) >= 0.6*max(degree(tmpNetS))))
if (mysn < 2) {
# mysn = 2
# print('Problemos')
cat('##########\n')
mysubg = unConnectedNet
j = 1 # this branch has a single subcluster; j is used in the plot label and row names below
cat('connected\t', is.connected(mysubg), '\n')
cat("# vertices", vcount(mysubg), "\n")
cat("# edges", ecount(mysubg), "\n")
cat("#edges/#vertices", ecount(mysubg)/vcount(mysubg), "\n")
q = ecount(mysubg)/(vcount(mysubg)*(vcount(mysubg) - 1))
cat("#edges/#MAXedges", q, "\n")
if ((ecount(mysubg) < 2^6) & (vcount(mysubg) < 2^6)) {
l2 = layout.fruchterman.reingold(mysubg, niter = 2^13)
} else {
l1 = layout_on_grid(mysubg, dim = 2)
l2 = layout_with_kk(mysubg, coords = l1, dim = 2,
maxiter = 999 * vcount(mysubg),
epsilon = 0, kkconst = vcount(mysubg),
#weights = rep(100, length.out),
minx = NULL, maxx = NULL,
miny = NULL, maxy = NULL,
minz = NULL,maxz = NULL)
z = round(ecount(mysubg)/vcount(mysubg))
l2 = l2*2*z
}
if ((vcount(mysubg) >= 2^2) & (ecount(mysubg) <= 2^11)) {
plot(0, type = "n", ann = TRUE, axes = FALSE,
xlim = extendrange(l2[,1]),
ylim = extendrange(l2[,2]),
xlab = paste0('origigi; clu: ', IND[i]),
ylab = paste0('subclu: ',j))
plot(mysubg,
layout = l2,
vertex.color = 'gray60', #### #### #### ####
edge.label = '',
vertex.label = V(mysubg)$shortName,
rescale = FALSE, add = TRUE,
vertex.label.cex = 0.25,
edge.arrow.size = 0.025,
edge.arrow.width = 0.025,
edge.lty = 'solid',
edge.color = 'gray',
edge.width = 0.025,
# edge.arrow.mode = 0,
edge.label.cex = 0.025,
edge.curved = autocurve.edges(mysubg))
}
ttmplist = vector()
ttmplist = c(ttmplist, paste0((IND)[i],
'_unConnectedNet_mysubg_',
j,
'_',
seq(1,vcount(mysubg),1)))
print(ttmplist)
mydeg = degree(mysubg, loops = FALSE,
normalized = FALSE, mode = "all")
mydegSCs = mydegSC[which(!is.na(match(mydegSC[,1], V(mysubg)$numVec))),2]
for (l in 1:vcount(mysubg)) {
tmpvec = c()
tmpvec = c(V(mysubg)$nodeID[l], l2[l,1], l2[l,2], myClusters,
ttmplist[l], mydegSCs[l], mydeg[l], is.connected(mysubg))
mat = rbind(mat,tmpvec)
}
myClusters = myClusters + 1
} else {
cat('mysn', mysn, '\n\n')
########## ########## ########## ########## ##########
# set.seed(123456)
clspin = spinglass.community(unConnectedNet, spins = mysn)
cntV = cntV + vcount(unConnectedNet)
cat("# spins", mysn, "\n")
df = (as.data.frame(table(clspin$membership)))
print((df))
df[,1] = strtoi(df[,1])
df[,2] = strtoi(df[,2])
for (j in df[,1]) {
cat('##########\n')
mysubg = induced.subgraph(graph = unConnectedNet, vids = strtoi(match(clspin[[j]], clspin$names))) #### #### #### ####
cat('connected\t', is.connected(mysubg), '\n')
cat("# vertices", vcount(mysubg), "\n")
cat("# edges", ecount(mysubg), "\n")
cat("#edges/#vertices", ecount(mysubg)/vcount(mysubg), "\n")
q = ecount(mysubg)/(vcount(mysubg)*(vcount(mysubg) - 1))
cat("#edges/#MAXedges", q, "\n")
if ((ecount(mysubg) < 2^6) & (vcount(mysubg) < 2^6)) {
l2 = layout.fruchterman.reingold(mysubg, niter = 2^13)
} else {
l1 = layout_on_grid(mysubg, dim = 2)
l2 = layout_with_kk(mysubg, coords = l1, dim = 2,
maxiter = 999 * vcount(mysubg),
epsilon = 0, kkconst = vcount(mysubg),
#weights = rep(100, length.out),
minx = NULL, maxx = NULL,
miny = NULL, maxy = NULL,
minz = NULL,maxz = NULL)
z = round(ecount(mysubg)/vcount(mysubg))
l2 = l2*2*z
}
if ((vcount(mysubg) >= 2^2) & (ecount(mysubg) <= 2^11)) {
plot(0, type = "n", ann = TRUE, axes = FALSE,
xlim = extendrange(l2[,1]),
ylim = extendrange(l2[,2]),
xlab = paste0('origigi; clu: ', IND[i]),
ylab = paste0('subclu: ',j))
plot(mysubg,
layout = l2,
vertex.color = 'gray60', #### #### #### ####
edge.label = '',
vertex.label = V(mysubg)$shortName,
rescale = FALSE, add = TRUE,
vertex.label.cex = 0.25,
edge.arrow.size = 0.025,
edge.arrow.width = 0.025,
edge.lty = 'solid',
edge.color = 'gray',
edge.width = 0.025,
# edge.arrow.mode = 0,
edge.label.cex = 0.025,
edge.curved = autocurve.edges(mysubg))
}
ttmplist = vector()
ttmplist = c(ttmplist, paste0((IND)[i],
'_unConnectedNet_mysubg_',
j,
'_',
seq(1,vcount(mysubg),1)))
print(ttmplist)
mydeg = degree(mysubg, loops = FALSE,
normalized = FALSE, mode = "all")
mydegSCs = mydegSC[which(!is.na(match(mydegSC[,1], V(mysubg)$numVec))),2]
for (l in 1:vcount(mysubg)) {
tmpvec = c()
tmpvec = c(V(mysubg)$nodeID[l], l2[l,1], l2[l,2], myClusters,
ttmplist[l], mydegSCs[l], mydeg[l], is.connected(mysubg))
mat = rbind(mat,tmpvec)
}
myClusters = myClusters + 1
}
###
}
}
myseq = paste0(rep('unConnectedNet'),'_', tmplist)
rownames(mat)[which(rownames(mat) == 'tmpvec')] = myseq
# any(duplicated(mat[grep('unConnectedNet',rownames(mat)),1]))
# bla = data.frame(mat[which(duplicated(mat[,1])),c(1,4)])
# bla = bla[with(bla, order(bla[,1])), ]
print(dim(mat))
# print(head(mat))
print(tail(mat))
print(myClusters)
}
save.image(paste0(dir1, '/', myname, 'pt1.RData'))
# load(paste0(dir1, '/', myname, 'pt1.RData'))
#######
cntV = 0
tmplist = vector()
IND = as.vector(big.0.0[,1])
if (length(IND)) {
for (i in 1:length(IND)) {
tmplist = c(tmplist, paste0((IND)[i],'_',seq(1,(length(cl[[IND[i]]])),1)))
cat('i = ', i, "\n")
ind = strtoi(match(cl[[IND[i]]], cl$names)) #### #### #### ####
entryGraph = induced.subgraph(graph = g,vids = ind)
cat("# vertices", vcount(entryGraph), "\n")
cat("# edges", ecount(entryGraph), "\n")
cat("#edges/#vertices", ecount(entryGraph)/vcount(entryGraph), "\n")
mydegSC = degree(entryGraph, loops = FALSE,
normalized = FALSE, mode = "all")
mydegSC = cbind(ind, mydegSC)
mysn = mydf[which(mydf[,1] == IND[i]),15]
if (mysn < 2) {
mysn = 2
}
cat('mysn', mysn, '\n\n')
########## ########## ########## ########## ##########
########## ########## ########## ########## ##########
# set.seed(123456)
clspin = spinglass.community(entryGraph, spins = mysn)
cntV = cntV + vcount(entryGraph)
df = (as.data.frame(table(clspin$membership)))
print((df))
df[,1] = strtoi(df[,1])
df[,2] = strtoi(df[,2])
cat("# spins", dim(df)[1], "\n")
for (j in df[,1]) {
cat('##########\n')
mysubg = induced.subgraph(graph = entryGraph, vids = strtoi(match(clspin[[j]], clspin$names))) #### #### #### ####
cat('connected\t', is.connected(mysubg), '\n')
cat("# vertices", vcount(mysubg), "\n")
cat("# edges", ecount(mysubg), "\n")
cat("#edges/#vertices", ecount(mysubg)/vcount(mysubg), "\n")
q = ecount(mysubg)/(vcount(mysubg)*(vcount(mysubg) - 1))
cat("#edges/#MAXedges", q, "\n")
if ((ecount(mysubg) < 2^6) & (vcount(mysubg) < 2^6)) {
l2 = layout.fruchterman.reingold(mysubg, niter = 2^13)
} else {
l1 = layout_on_grid(mysubg, dim = 2)
l2 = layout_with_kk(mysubg, coords = l1, dim = 2,
maxiter = 999 * vcount(mysubg),
epsilon = 0, kkconst = vcount(mysubg),
minx = NULL, maxx = NULL,
miny = NULL, maxy = NULL,
minz = NULL,maxz = NULL)
z = round(ecount(mysubg)/vcount(mysubg))
l2 = l2*2*z
}
if ((vcount(mysubg) >= 2^2) & (ecount(mysubg) <= 2^11)) {
plot(0, type = "n", ann = TRUE, axes = FALSE,
xlim = extendrange(l2[,1]),
ylim = extendrange(l2[,2]),
xlab = paste0('origigi; clu: ', IND[i]),
ylab = paste0('subclu: ',j))
plot(mysubg,
layout = l2,
vertex.color = 'gray60', #### #### #### ####
#edge.label=E(mysubg)$intType,
edge.label = '',
vertex.label = V(mysubg)$shortName, #$X12_shortName,
rescale = FALSE, add = TRUE,
vertex.label.cex = 0.25,
edge.arrow.size = 0.025,
edge.arrow.width = 0.025,
edge.lty = 'solid',
edge.color = 'gray',
edge.width = 0.025,
# edge.arrow.mode = 0,
edge.label.cex = 0.025,
edge.curved = autocurve.edges(mysubg))
}
ttmplist = vector()
ttmplist = c(ttmplist, paste0((IND)[i],
'_big_mysubg_',
j,
'_',
seq(1,vcount(mysubg),1)))
mydeg = degree(mysubg, loops = FALSE,
normalized = FALSE, mode = "all")
mydegSCs = mydegSC[which(!is.na(match(mydegSC[,1], V(mysubg)$numVec))),2]
for (l in 1:vcount(mysubg)) {
tmpvec = c()
tmpvec = c(V(mysubg)$nodeID[l], l2[l,1], l2[l,2], myClusters,
ttmplist[l], mydegSCs[l], mydeg[l], is.connected(mysubg))
mat = rbind(mat,tmpvec)
}
myClusters = myClusters + 1
}
###
}
######## ######## ######## ########
}
myseq = paste0(rep('big'),'_', tmplist)
rownames(mat)[which(rownames(mat) == 'tmpvec')] = myseq
print(dim(mat))
## [1] 10000 8
# print(head(mat))
print(tail(mat))
## nodeID x y clu
## moderate_5_158 "node9694" "10.7438711990581" "2.35020881365661" "24"
## moderate_5_159 "node9900" "17.9753776799184" "14.8238428319454" "24"
## moderate_5_160 "node9901" "6.18628438612962" "13.1078200632064" "24"
## moderate_5_161 "node9937" "-0.483376095241185" "8.7875745242926" "24"
## moderate_5_162 "node9979" "12.1238113826569" "15.0392982147078" "24"
## moderate_5_163 "node9991" "3.91893323565064" "8.76043279722705" "24"
## origin degreeSimplifiedSuperCluster
## moderate_5_158 "normal_5_158" "1"
## moderate_5_159 "normal_5_159" "3"
## moderate_5_160 "normal_5_160" "3"
## moderate_5_161 "normal_5_161" "2"
## moderate_5_162 "normal_5_162" "4"
## moderate_5_163 "normal_5_163" "3"
## degreeSimplifiedCluster isconnected
## moderate_5_158 "1" "TRUE"
## moderate_5_159 "3" "TRUE"
## moderate_5_160 "3" "TRUE"
## moderate_5_161 "2" "TRUE"
## moderate_5_162 "4" "TRUE"
## moderate_5_163 "3" "TRUE"
print(myClusters - 1)
## [1] 24
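`spinglass.community` accepts only connected graphs and is stochastic, yet the loop over `big.0.0` above calls it on `entryGraph` directly, with `set.seed()` commented out. The sketch below shows one possible guard, assuming igraph is attached; `spinglass_largest` and its `seed` argument are hypothetical names, and keeping only the largest weak component is an assumption modelled on the `unConnected` block, not something the original does for big clusters.

```r
library(igraph)

# Hypothetical wrapper: run spinglass on the largest weak component only.
spinglass_largest <- function(net, spins, seed = 123456) {
  if (!is_connected(net, mode = "weak")) {
    parts <- decompose(net, mode = "weak")
    sizes <- sapply(parts, vcount)
    net <- parts[[which.max(sizes)]]   # which.max() gives one index even on ties
  }
  set.seed(seed)                       # fix the random start point before clustering
  list(graph = net,
       communities = cluster_spinglass(net, spins = spins))
}

# Toy input: a 20-vertex ring plus a separate 5-vertex ring
toy <- disjoint_union(make_ring(20), make_ring(5))
res <- spinglass_largest(toy, spins = 3)
table(membership(res$communities))
```

The returned `graph` matters as much as the communities: membership indices refer to the component that was actually clustered, not to the graph that was passed in.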
save.image(paste0(dir1, '/', myname, 'pt2.RData'))
# load(paste0(dir1, '/', myname, 'pt2.RData'))
mydataframe = as.data.frame(mat)
mydataframe[,1] = (as.character(((mydataframe[,1]))))
mydataframe[,2] = as.numeric(as.character(((mydataframe[,2]))))
mydataframe[,3] = as.numeric(as.character((mydataframe[,3])))
mydataframe[,4] = strtoi(as.character(mydataframe[,4]))
mydataframe[,5] = (as.character(((mydataframe[,5]))))
mydataframe[,6] = strtoi(as.character(mydataframe[,6]))
mydataframe[,7] = strtoi(as.character(mydataframe[,7]))
mydataframe[,8] = (as.character(((mydataframe[,8]))))
head(mat)
## nodeID x y clu
## mini_1 "node9200" "0" "0" "0"
## moderate_1_1 "node11" "83.5057774954021" "91.2687637225665" "1"
## moderate_1_2 "node14" "64.5110945146726" "111.3011542721" "1"
## moderate_1_3 "node24" "30.1632693020225" "91.806537536431" "1"
## moderate_1_4 "node29" "12.7429413492416" "93.9103109620578" "1"
## moderate_1_5 "node35" "27.5134177465723" "92.4479709492698" "1"
## origin degreeSimplifiedSuperCluster
## mini_1 "mini_1" "0"
## moderate_1_1 "normal_1_1" "3"
## moderate_1_2 "normal_1_2" "2"
## moderate_1_3 "normal_1_3" "2"
## moderate_1_4 "normal_1_4" "2"
## moderate_1_5 "normal_1_5" "4"
## degreeSimplifiedCluster isconnected
## mini_1 "0" "TRUE"
## moderate_1_1 "3" "TRUE"
## moderate_1_2 "2" "TRUE"
## moderate_1_3 "2" "TRUE"
## moderate_1_4 "2" "TRUE"
## moderate_1_5 "4" "TRUE"
head(mydataframe)
## nodeID x y clu origin
## mini_1 node9200 0.00000 0.00000 0 mini_1
## moderate_1_1 node11 83.50578 91.26876 1 normal_1_1
## moderate_1_2 node14 64.51109 111.30115 1 normal_1_2
## moderate_1_3 node24 30.16327 91.80654 1 normal_1_3
## moderate_1_4 node29 12.74294 93.91031 1 normal_1_4
## moderate_1_5 node35 27.51342 92.44797 1 normal_1_5
## degreeSimplifiedSuperCluster degreeSimplifiedCluster
## mini_1 0 0
## moderate_1_1 3 3
## moderate_1_2 2 2
## moderate_1_3 2 2
## moderate_1_4 2 2
## moderate_1_5 4 4
## isconnected
## mini_1 TRUE
## moderate_1_1 TRUE
## moderate_1_2 TRUE
## moderate_1_3 TRUE
## moderate_1_4 TRUE
## moderate_1_5 TRUE
typeof(mat[,2])
## [1] "character"
typeof(mydataframe[,2])
## [1] "double"
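The `typeof()` results above show why the conversion is needed: rows appended with `rbind(mat, tmpvec)` mix strings and numbers, so the whole matrix is stored as character. As an alternative to converting columns by position, they can be converted by name, which keeps working if the column order changes. This is a small sketch with invented values; the column names follow the output above, and storing `isconnected` as logical is a deviation from the original, which keeps it as character.

```r
# Toy character matrix shaped like `mat` (values invented for illustration)
m <- rbind(c("node11", "83.5", "91.2", "1", "normal_1_1", "3", "3", "TRUE"),
           c("node14", "64.5", "111.3", "1", "normal_1_2", "2", "2", "TRUE"))
colnames(m) <- c("nodeID", "x", "y", "clu", "origin",
                 "degreeSimplifiedSuperCluster", "degreeSimplifiedCluster",
                 "isconnected")

d <- as.data.frame(m, stringsAsFactors = FALSE)
num_cols <- c("x", "y")
int_cols <- c("clu", "degreeSimplifiedSuperCluster", "degreeSimplifiedCluster")
d[num_cols] <- lapply(d[num_cols], as.numeric)   # coordinates become double
d[int_cols] <- lapply(d[int_cols], as.integer)   # ids and degrees become integer
d$isconnected <- as.logical(d$isconnected)       # deviation: logical, not character
str(d)
```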
mat = mydataframe
plot(mat[,4])
hist(strtoi(mat[,4]), breaks = myClusters)
(all(mat[,1] %in% V(g)$geneID))
## [1] TRUE
(all(V(g)$geneID %in% mat[,1]))
## [1] TRUE
length(intersect(V(g)$nodeID, mat[,1]))
## [1] 10000
dim(mat)[1] == vcount(g)
## [1] TRUE
vcount(g) - dim(mat)[1]
## [1] 0
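The checks above confirm that every vertex ended up in `mat`; note that they compare against both `V(g)$geneID` and `V(g)$nodeID`, which presumably hold the same identifiers in this data set. The same invariants can be enforced so that the script stops on failure instead of printing TRUE or FALSE. The sketch below uses toy vectors; `check_coverage` is a hypothetical helper, and the named `stopifnot()` messages assume R 4.0 or later.

```r
# Hypothetical helper: stop unless the two identifier sets match one-to-one.
check_coverage <- function(table_ids, graph_ids) {
  stopifnot(
    "an identifier occurs more than once" = anyDuplicated(table_ids) == 0,
    "table and graph identifiers differ"  = setequal(table_ids, graph_ids)
  )
  invisible(TRUE)
}

# Passes: same identifiers, different order
ok <- check_coverage(c("node2", "node1", "node3"),
                     c("node1", "node2", "node3"))

# Fails: duplicated identifier; the error message is captured instead of thrown
bad <- tryCatch(check_coverage(c("node1", "node1"), c("node1", "node2")),
                error = function(e) conditionMessage(e))
```

With the real objects the call would be along the lines of `check_coverage(mat[,1], V(g)$nodeID)`.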
myind = match(mat[,1],V(g)$nodeID)
mynumvec = strtoi(V(g)$numVec[myind])
mat.t = (as.data.frame(cbind(mat,as.numeric(mynumvec)), stringsAsFactors = FALSE))
colnames(mat.t)[dim(mat.t)[2]] = 'mynumvec'
mat.t[,dim(mat.t)[2]] = as.numeric(mat.t[,dim(mat.t)[2]])
# head(mat.t)
# tail(mat.t)
mat.sorted = mat.t[with(mat.t, order(mynumvec)), ]
head(mat.sorted)
## nodeID x y clu origin
## moderate_13_1 node1 9.247902 16.00439 22 normal_13_1
## moderate_17_1 node2 3.059256 41.71142 6 normal_17_1
## moderate_19_1 node3 35.601202 14.26304 5 normal_19_1
## moderate_4_1 node4 21.648699 11.31966 17 normal_4_1
## moderate_25_1 node5 47.105806 28.83630 3 normal_25_1
## moderate_2_1 node6 2.319479 12.49798 20 normal_2_1
## degreeSimplifiedSuperCluster degreeSimplifiedCluster
## moderate_13_1 3 3
## moderate_17_1 3 3
## moderate_19_1 3 3
## moderate_4_1 3 3
## moderate_25_1 4 4
## moderate_2_1 3 3
## isconnected mynumvec
## moderate_13_1 TRUE 1
## moderate_17_1 TRUE 2
## moderate_19_1 TRUE 3
## moderate_4_1 TRUE 4
## moderate_25_1 TRUE 5
## moderate_2_1 TRUE 6
tail(mat.sorted)
## nodeID x y clu origin
## moderate_10_494 node9995 5.342874 35.28369 8 normal_10_494
## moderate_1_818 node9996 41.370315 91.92627 1 normal_1_818
## moderate_20_209 node9997 11.730375 21.90694 21 normal_20_209
## moderate_17_556 node9998 23.382699 60.30320 6 normal_17_556
## moderate_21_779 node9999 39.172314 33.55678 2 normal_21_779
## moderate_6_399 node10000 24.265273 25.87456 11 normal_6_399
## degreeSimplifiedSuperCluster degreeSimplifiedCluster
## moderate_10_494 1 1
## moderate_1_818 4 4
## moderate_20_209 2 2
## moderate_17_556 3 3
## moderate_21_779 3 3
## moderate_6_399 5 5
## isconnected mynumvec
## moderate_10_494 TRUE 9995
## moderate_1_818 TRUE 9996
## moderate_20_209 TRUE 9997
## moderate_17_556 TRUE 9998
## moderate_21_779 TRUE 9999
## moderate_6_399 TRUE 10000
write.table(mat.sorted, paste0(dir1, '/', myname, "_id_coord_clu.tsv"),
sep = "\t", quote = FALSE, row.names = TRUE, col.names = TRUE)
# all vertex attributes
# list.vertex.attributes(g)
v = list.vertex.attributes(g)
# for (i in v) {
# print(head(get.vertex.attribute(g,i)))
# }
# all edge attributes
# list.edge.attributes(g)
e = list.edge.attributes(g)
# for (j in e) {
# print(head(get.edge.attribute(g,j)))
# }
setdiff(V(g)$geneID, mat.sorted$nodeID)
## character(0)
length(mat.sorted$mynumvec)
## [1] 10000
min(mat.sorted$mynumvec)
## [1] 1
max(mat.sorted$mynumvec)
## [1] 10000
missing = setdiff(seq(1,max(mat.sorted$mynumvec),1), mat.sorted$mynumvec)
myind = match(V(g)$nodeID, mat.sorted[,1])
noInfo = which(is.na(myind))
if (length(noInfo)) {
for (i in 1:length(noInfo)) {
print(i)
# one value per column of mat.sorted (9 columns): both degree columns are set to 0
tmpvec = c(V(g)$nodeID[noInfo[i]], 0, 0, 0, '-', 0, 0, 'FALSE', noInfo[i])
mat.sorted = rbind(mat.sorted, tmpvec)
}
}
mat.sorted$mynumvec = strtoi( mat.sorted$mynumvec)
mat.sorted = mat.sorted[with(mat.sorted , order(mynumvec)), ]
any(!(V(g)$nodeID %in% mat.sorted$nodeID))
## [1] FALSE
any(!(mat.sorted$nodeID %in% V(g)$nodeID))
## [1] FALSE
all(mat.sorted$mynumvec[noInfo] == noInfo)
## [1] TRUE
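The assignments that follow copy columns of `mat.sorted` onto the vertices by position, which is correct only if sorting by `mynumvec` reproduces the vertex order of `g`. Indexing through `match()` removes that assumption. Here is a minimal sketch with invented data, where the table and the vertices deliberately come in different orders:

```r
# Vertex order in the graph (toy values)
vertex_ids <- c("node3", "node1", "node2")

# Result table in a different order
tab <- data.frame(nodeID = c("node1", "node2", "node3"),
                  clu = c(22L, 6L, 5L),
                  stringsAsFactors = FALSE)

idx <- match(vertex_ids, tab$nodeID)   # table row belonging to each vertex
stopifnot(!anyNA(idx))                 # every vertex must have a row
clu_by_vertex <- tab$clu[idx]          # values reordered to vertex order
```

With the real objects this would read, for example, `V(g)$membership <- mat.sorted$clu[match(V(g)$nodeID, mat.sorted$nodeID)]`.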
myind = myind[which(!(is.na(myind)))]
myind = match(V(g)$nodeID, mat.sorted[,1])
V(g)$superClu <- V(g)$membership
V(g)$membership <- (mat.sorted$clu)
V(g)$x <- as.numeric(mat.sorted$x)
V(g)$y <- as.numeric(mat.sorted$y)
V(g)$cluOrigin <- mat.sorted$origin
V(g)$degreeSimplifiedCluster <- mat.sorted$degreeSimplifiedCluster
V(g)$degreeSimplifiedSuperCluster <- mat.sorted$degreeSimplifiedSuperCluster
V(g)$isconnected <- mat.sorted$isconnected
V(g)$exists <- rep(1, vcount(g))
# list.vertex.attributes(g)
E(g)$cluA = V(g)$membership[match(E(g)$geneID1, V(g)$geneID)]
E(g)$cluB = V(g)$membership[match(E(g)$geneID2, V(g)$geneID)]
E(g)$superCluA = V(g)$superClu[match(E(g)$geneID1, V(g)$geneID)]
E(g)$superCluB = V(g)$superClu[match(E(g)$geneID2, V(g)$geneID)]
E(g)$cluOriginA = V(g)$cluOrigin[match(E(g)$geneID1, V(g)$geneID)]
E(g)$cluOriginB = V(g)$cluOrigin[match(E(g)$geneID2, V(g)$geneID)]
E(g)$matchClu = (E(g)$cluA == E(g)$cluB)
E(g)$degreeFullSimplifiedA = V(g)$degreeFullSimplified[match(E(g)$geneID1, V(g)$geneID)]
E(g)$degreeFullSimplifiedB = V(g)$degreeFullSimplified[match(E(g)$geneID2, V(g)$geneID)]
E(g)$degreeSimplifiedSuperClusterA = V(g)$degreeSimplifiedSuperCluster[match(E(g)$geneID1, V(g)$geneID)]
E(g)$degreeSimplifiedSuperClusterB = V(g)$degreeSimplifiedSuperCluster[match(E(g)$geneID2, V(g)$geneID)]
E(g)$degreeSimplifiedClusterA = V(g)$degreeSimplifiedCluster[match(E(g)$geneID1, V(g)$geneID)]
E(g)$degreeSimplifiedClusterB = V(g)$degreeSimplifiedCluster[match(E(g)$geneID2, V(g)$geneID)]
E(g)$exists <- rep(1, ecount(g))
# list.edge.attributes(g)
# first check if these attributes exist
v = list.vertex.attributes(g)
e = list.edge.attributes(g)
df1 = matrix(NA, dim(mat.sorted)[1], length(v))
for (i in 1:length(v)) {
df1[,i] = get.vertex.attribute(g,v[i])
}
myind = match(V(g)$nodeID, mat.sorted[,1])
myind = myind[!(is.na(myind))]
df1 = as.data.frame(df1, stringsAsFactors = FALSE)
colnames(df1) = v
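Filling a matrix column by column has a side effect worth knowing: a matrix holds a single type, so every attribute ends up as character. That is harmless for the TSV export, but it matters for any later comparison on the data frame. A small sketch on a made-up attribute list (attrs, df.matrix and df.typed are illustrative names, not objects from the script):

```r
# Hypothetical attribute list standing in for the graph's vertex attributes.
attrs <- list(geneID = c("g1", "g2"), x = c(1.5, 2.5), isconnected = c(TRUE, FALSE))

# Matrix route, as in the script: one type for the whole matrix, so all character.
m <- matrix(NA, 2, length(attrs))
for (i in seq_along(attrs)) m[, i] <- attrs[[i]]
df.matrix <- as.data.frame(m, stringsAsFactors = FALSE)
colnames(df.matrix) <- names(attrs)

# List route: each column keeps its own type.
df.typed <- as.data.frame(attrs, stringsAsFactors = FALSE)

sapply(df.matrix, class)
sapply(df.typed, class)
```

In df.matrix the logical column holds the strings "TRUE" and "FALSE"; a test such as `== TRUE` still works because R coerces TRUE to "TRUE" before comparing.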
# str(df1)
# select the columns to export (order must match the new column names below)
colNames = toupper(c('geneID', 'shortDescription', 'shortName', 'MapManBin',
'numVec', 'superClu', 'membership', 'cluOrigin', 'x', 'y',
'degreeFullSimplified',
'degreeSimplifiedSuperCluster',
'degreeSimplifiedCluster',
'isconnected',
'exists'))
importantColsE = unlist(sapply(colNames,
function(x) grep(paste("^",x,"$", sep = ""),
toupper((v)))))
df1 = df1[,importantColsE]
colnames(df1) = c('geneID',
'shortDescription',
'shortName',
'MapManBin',
'sortingOrder',
'superClusterID',
'clusterID',
'clusterOrigin',
'x',
'y',
'networkSimplifiedNodeDegree',
'superClusterSimplifiedNodeDegree',
'clusterSimplifiedNodeDegree',
'isConnected',
'expressed')
write.table(df1, paste0(dir1, '/', myname, "_NODES.tsv"),
quote = FALSE,
row.names = FALSE,
sep = "\t")
df2 = matrix(NA, ecount(g), length(e))
for (i in 1:length(e)) {
df2[,i] = get.edge.attribute(g,e[i])
}
df2 = as.data.frame(df2, stringsAsFactors = FALSE)
colnames(df2) = e
# str(df2)
# select the columns to export (order must match the new column names below)
colNames = toupper(c('geneID1', 'geneID2', 'reactionType',
'cluA', 'cluB',
'superCluA', 'superCluB',
'cluOriginA', 'cluOriginB', 'matchClu',
'degreeFullSimplifiedA', 'degreeFullSimplifiedB' ,
'degreeSimplifiedSuperClusterA', 'degreeSimplifiedSuperClusterB',
'degreeSimplifiedClusterA' ,'degreeSimplifiedClusterB',
'exists'))
importantColsE = unlist(sapply(colNames,
function(x) grep(paste("^",x,"$", sep = ""),
toupper((e)))))
df2 = df2[,importantColsE]
colnames(df2) = c('geneID1',
'geneID2',
'reactionType',
'clusterID_geneID1',
'clusterID_geneID2',
'superClusterID_geneID1',
'superClusterID_geneID2',
'clusterOrigin_geneID1',
'clusterOrigin_geneID2',
'matchingClusters',
'networkSimplifiedNodeDegree_geneID1',
'networkSimplifiedNodeDegree_geneID2',
'superClusterSimplifiedNodeDegree_geneID1',
'superClusterSimplifiedNodeDegree_geneID2',
'clusterSimplifiedNodeDegree_geneID1',
'clusterSimplifiedNodeDegree_geneID2',
'exists')
write.table(df2, paste0(dir1, '/', myname, "_EDGES_all.tsv"),
quote = FALSE,
row.names = FALSE,
sep = "\t")
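The grep()-based column selection used for both tables returns nothing for a name that is not present, so unlist() silently shortens the index vector and the renaming that follows no longer lines up. A minimal sketch of a stricter lookup with match(), which fails loudly instead. select_cols is a hypothetical helper, not part of the script, and the attribute names in `available` are made up:

```r
# Hypothetical helper: positions of the wanted columns in `available`,
# ignoring case; stops with a message if any wanted name is missing.
select_cols <- function(wanted, available) {
  idx <- match(toupper(wanted), toupper(available))
  if (anyNA(idx)) {
    stop("missing attributes: ", paste(wanted[is.na(idx)], collapse = ", "))
  }
  idx
}

available <- c("weight", "geneID1", "geneID2", "reactionType", "exists")
idx <- select_cols(c("geneid1", "geneID2", "exists"), available)
available[idx]
```

The returned index keeps the order of `wanted`, which is what the fixed list of new column names relies on.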
df3 = df2[which(df2$matchingClusters == TRUE),]
write.table(df3, paste0(dir1, '/', myname, "_EDGES_byClu.tsv"),
quote = FALSE,
row.names = FALSE,
col.names = TRUE,
sep = "\t")
save.image(paste0(dir1, '/', myname,'_clu_and_coord.RData'))
exectime <- toc()
## 432.01 sec elapsed
print(exectime)
## $tic
## elapsed
## 1.61
##
## $toc
## elapsed
## 433.62
##
## $msg
## logical(0)
http://r.789695.n4.nabble.com/igraph-and-plotting-connected-components-td836537.html